Hi,

I've noticed a few csum mismatch messages, and a few failed xfstests:

- 3.8.0-rc1
- default mkfs options
- MOUNT_OPTIONS -- -o space_cache,noatime,inode_cache
- test device: 40G
- scratch device: 10G

091:
--- 091.out 2011-11-01 10:31:12.000000000 +0100
+++ 091.out.bad 2013-01-03 21:07:29.000000000 +0100
@@ -1,7 +1,45 @@
 QA output created by 091
 fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -W
+fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
+mapped writes DISABLED
+truncating to largest ever: 0x12a00
+truncating to largest ever: 0x75400
+doread: read: Input/output error
+LOG DUMP (35 total operations):
+1( 1 mod 256): SKIPPED (no operation)
+2( 2 mod 256): WRITE 0x62600 thru 0x6bdff (0x9800 bytes) HOLE
+3( 3 mod 256): FALLOC 0x2e0f2 thru 0x3134a (0x3258 bytes) INTERIOR
+4( 4 mod 256): TRUNCATE DOWN from 0x6be00 to 0x12a00
+5( 5 mod 256): READ 0x0 thru 0xdfff (0xe000 bytes)
+6( 6 mod 256): FALLOC 0x7048 thru 0x9f54 (0x2f0c bytes) INTERIOR
+7( 7 mod 256): WRITE 0x5ea00 thru 0x6e7ff (0xfe00 bytes) HOLE
+8( 8 mod 256): READ 0x16000 thru 0x17fff (0x2000 bytes)
+9( 9 mod 256): FALLOC 0x4957f thru 0x5298e (0x940f bytes) INTERIOR
+10( 10 mod 256): SKIPPED (no operation)
+11( 11 mod 256): WRITE 0x10a00 thru 0x173ff (0x6a00 bytes)
+12( 12 mod 256): WRITE 0x53800 thru 0x5a7ff (0x7000 bytes)
+13( 13 mod 256): WRITE 0x5ae00 thru 0x5afff (0x200 bytes)
+14( 14 mod 256): READ 0x5d000 thru 0x66fff (0xa000 bytes)
+15( 15 mod 256): SKIPPED (no operation)
+16( 16 mod 256): READ 0x21000 thru 0x2bfff (0xb000 bytes)
+17( 17 mod 256): SKIPPED (no operation)
+18( 18 mod 256): READ 0x47000 thru 0x4ffff (0x9000 bytes)
+19( 19 mod 256): WRITE 0x17600 thru 0x25bff (0xe600 bytes)
+20( 20 mod 256): READ 0x3f000 thru 0x48fff (0xa000 bytes)
+21( 21 mod 256): FALLOC 0xea89 thru 0x19800 (0xad77 bytes) INTERIOR
+22( 22 mod 256): FALLOC 0x569aa thru 0x586ea (0x1d40 bytes) INTERIOR
+23( 23 mod 256): WRITE 0x35c00 thru 0x453ff (0xf800 bytes)
+24( 24 mod 256): SKIPPED (no operation)
+25( 25 mod 256): SKIPPED (no operation)
+26( 26 mod 256): READ 0x21000 thru 0x26fff (0x6000 bytes)
+27( 27 mod 256): READ 0x5e000 thru 0x61fff (0x4000 bytes)
+28( 28 mod 256): WRITE 0x6f600 thru 0x6f7ff (0x200 bytes) HOLE
+29( 29 mod 256): READ 0x13000 thru 0x19fff (0x7000 bytes)
+30( 30 mod 256): TRUNCATE UP from 0x6f800 to 0x75400
+31( 31 mod 256): READ 0x4000 thru 0xafff (0x7000 bytes)
+32( 32 mod 256): SKIPPED (no operation)
+33( 33 mod 256): FALLOC 0x31d49 thru 0x3c520 (0xa7d7 bytes) INTERIOR
+34( 34 mod 256): FALLOC 0x2bbb3 thru 0x37ad8 (0xbf25 bytes) INTERIOR
+35( 35 mod 256): READ 0x68000 thru 0x73fff (0xc000 bytes)
+Correct content saved for comparison
+(maybe hexdump "/mnt/a1/junk" vs "/mnt/a1/junk.fsxgood")

I'm not quite sure if the messages match the test (best guess, neighbouring
tests were fine):

[102885.667444] btrfs csum failed ino 638 off 425984 csum 1842675109 private 2279232751
[102885.676804] btrfs csum failed ino 638 off 430080 csum 1842675109 private 1192041375
[102885.686094] btrfs csum failed ino 638 off 434176 csum 2297282744 private 1619428542
[102885.686103] btrfs csum failed ino 638 off 438272 csum 3709984297 private 2868627320
[102885.686112] btrfs csum failed ino 638 off 442368 csum 1504116677 private 1239355148
[102885.686121] btrfs csum failed ino 638 off 446464 csum 1957839041 private 3848200057
[102885.686129] btrfs csum failed ino 638 off 450560 csum 3836729483 private 2867416946

113:
- it hung last evening and is still in that state, no disk or cpu activity,
  there were only the tests running
- no process is in D state, no btrfs kernel thread is active
- the only interesting process is

  PID TTY STAT TIME COMMAND
15585 pts/0 Sl+ 0:01 /root/xfstests/ltp/aio-stress -t 20 -s 10 -O -S -I \
    1000 /mnt/a1/aiostress.15188.4 /mnt/a1/aiostress.15188.4.20 \
    /mnt/a1/aiostress.15188.4.19 /mnt/a1/aiostress.15188.4.18 \

[<ffffffff810af447>] futex_wait_queue_me+0xc7/0x100
[<ffffffff810affa1>] futex_wait+0x191/0x280
[<ffffffff810b1cb6>] do_futex+0xd6/0xbd0
[<ffffffff810b282b>] sys_futex+0x7b/0x180
[<ffffffff8195fe99>] system_call_fastpath+0x16/0x1b

there are also csum mismatch messages:

[103208.085544] btrfs csum failed ino 2998 off 2883584 csum 3942040493 private 2566472073
[103208.366484] btrfs csum failed ino 2990 off 2949120 csum 2166904896 private 2566472073

# btrfs inspect ino 2998 /mnt/a1
/mnt/a1/aiostress.15188.4.6
# btrfs inspect ino 2990 /mnt/a1
/mnt/a1/aiostress.15188.4.14

# lsof|grep aiostress|wc -l
342

but neither aiostress.<pid>.4.6 nor 4.14 is open

--

Previously, the same machine went through the tests with the following options:

MKFS_OPTIONS -- /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime /dev/sda9 /mnt/a2

MKFS_OPTIONS -- -m single -d single /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime /dev/sda9 /mnt/a2

MKFS_OPTIONS -- -m single -d single --mixed /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime /dev/sda9 /mnt/a2

MKFS_OPTIONS -- -m dup -d dup --mixed /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime /dev/sda9 /mnt/a2

MKFS_OPTIONS -- /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime /dev/sda9 /mnt/a2

MKFS_OPTIONS -- /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime /dev/sda9 /mnt/a2

MKFS_OPTIONS -- /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime,inode_cache /dev/sda9 /mnt/a2

MKFS_OPTIONS -- -m single -d single /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime,inode_cache /dev/sda9 /mnt/a2

MKFS_OPTIONS -- -m single -d single --mixed /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime,inode_cache /dev/sda9 /mnt/a2

MKFS_OPTIONS -- -m dup -d dup --mixed /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime,inode_cache /dev/sda9 /mnt/a2

MKFS_OPTIONS -- /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime,inode_cache /dev/sda9 /mnt/a2

The failed one has previously passed once, and it does not seem that it's
caused only by the inode_cache option.

--

Other information:

# btrfs fi df /mnt/a1
Data: total=4.01GiB, used=247.80MiB
System, RAID1: total=8.00MiB, used=4.00KiB
System: total=4.00MiB, used=0.00
Metadata, RAID1: total=1.00GiB, used=3.61MiB
Metadata: total=8.00MiB, used=0.00

# btrfs fi df /mnt/a2
Data: total=8.00MiB, used=292.00KiB
System, DUP: total=8.00MiB, used=4.00KiB
System: total=4.00MiB, used=0.00
Metadata, DUP: total=1.00GiB, used=24.00KiB
Metadata: total=8.00MiB, used=0.00

david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
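
One way to check the guess that the inode 638 messages belong to test 091 is to compare the failing offsets with the fsx log dump. The sketch below is not part of the original report: it parses the seven quoted dmesg lines and tests each failing block against the last logged operation (35, READ 0x68000 thru 0x73fff). The 4 KiB block size and the helper names are assumptions made for illustration.

```python
import re

# The seven csum-failure lines quoted in the report for inode 638.
DMESG = """\
[102885.667444] btrfs csum failed ino 638 off 425984 csum 1842675109 private 2279232751
[102885.676804] btrfs csum failed ino 638 off 430080 csum 1842675109 private 1192041375
[102885.686094] btrfs csum failed ino 638 off 434176 csum 2297282744 private 1619428542
[102885.686103] btrfs csum failed ino 638 off 438272 csum 3709984297 private 2868627320
[102885.686112] btrfs csum failed ino 638 off 442368 csum 1504116677 private 1239355148
[102885.686121] btrfs csum failed ino 638 off 446464 csum 1957839041 private 3848200057
[102885.686129] btrfs csum failed ino 638 off 450560 csum 3836729483 private 2867416946
"""

CSUM_RE = re.compile(r"btrfs csum failed ino (\d+) off (\d+) csum (\d+) private (\d+)")
BLOCK = 4096  # assumed block size (4 KiB); the report does not state it


def parse(log):
    """Return (ino, off, csum, private) integer tuples, one per failure line."""
    return [tuple(int(g) for g in m.groups()) for m in CSUM_RE.finditer(log)]


def overlaps(off, length, start, end):
    """True if [off, off + length) intersects the inclusive range [start, end]."""
    return off <= end and off + length - 1 >= start


failures = parse(DMESG)
offsets = [off for _, off, _, _ in failures]

# fsx operation 35 from the log dump: READ 0x68000 thru 0x73fff
READ_START, READ_END = 0x68000, 0x73FFF

for off in offsets:
    # Print the offset in hex, its block index, and whether op 35 covers it.
    print(hex(off), off // BLOCK, overlaps(off, BLOCK, READ_START, READ_END))
```

The failing offsets are seven consecutive 4 KiB blocks starting at 0x68000, all inside the range of the final READ. That is consistent with the messages coming from test 091, though it does not prove it.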
Chris Mason
2013-Jan-04 13:01 UTC
Re: [bug] csum mismatches and failed xfstests with 3.8-rc1
Thanks Dave,

On Fri, Jan 04, 2013 at 05:50:59AM -0700, David Sterba wrote:
> Hi,
>
> I've noticed a few csum mismatch messages, and a few failed xfstests:
>
> - 3.8.0-rc1
> - default mkfs options
> - MOUNT_OPTIONS -- -o space_cache,noatime,inode_cache
> - test device: 40G
> - scratch device: 10G

Josef, are the problems you see with 083 coming on the scratch drive or
the main disk?

>
> 091:
> --- 091.out 2011-11-01 10:31:12.000000000 +0100
> +++ 091.out.bad 2013-01-03 21:07:29.000000000 +0100
> @@ -1,7 +1,45 @@
>  QA output created by 091
>  fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> -fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> -fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> -fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> -fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -W
> +fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
> +mapped writes DISABLED
> +truncating to largest ever: 0x12a00
> +truncating to largest ever: 0x75400
> +doread: read: Input/output error
> +LOG DUMP (35 total operations):
> +1( 1 mod 256): SKIPPED (no operation)
> +2( 2 mod 256): WRITE 0x62600 thru 0x6bdff (0x9800 bytes) HOLE
> +3( 3 mod 256): FALLOC 0x2e0f2 thru 0x3134a (0x3258 bytes) INTERIOR
> +4( 4 mod 256): TRUNCATE DOWN from 0x6be00 to 0x12a00
> +5( 5 mod 256): READ 0x0 thru 0xdfff (0xe000 bytes)
> +6( 6 mod 256): FALLOC 0x7048 thru 0x9f54 (0x2f0c bytes) INTERIOR
> +7( 7 mod 256): WRITE 0x5ea00 thru 0x6e7ff (0xfe00 bytes) HOLE
> +8( 8 mod 256): READ 0x16000 thru 0x17fff (0x2000 bytes)
> +9( 9 mod 256): FALLOC 0x4957f thru 0x5298e (0x940f bytes) INTERIOR
> +10( 10 mod 256): SKIPPED (no operation)
> +11( 11 mod 256): WRITE 0x10a00 thru 0x173ff (0x6a00 bytes)
> +12( 12 mod 256): WRITE 0x53800 thru 0x5a7ff (0x7000 bytes)
> +13( 13 mod 256): WRITE 0x5ae00 thru 0x5afff (0x200 bytes)
> +14( 14 mod 256): READ 0x5d000 thru 0x66fff (0xa000 bytes)
> +15( 15 mod 256): SKIPPED (no operation)
> +16( 16 mod 256): READ 0x21000 thru 0x2bfff (0xb000 bytes)
> +17( 17 mod 256): SKIPPED (no operation)
> +18( 18 mod 256): READ 0x47000 thru 0x4ffff (0x9000 bytes)
> +19( 19 mod 256): WRITE 0x17600 thru 0x25bff (0xe600 bytes)
> +20( 20 mod 256): READ 0x3f000 thru 0x48fff (0xa000 bytes)
> +21( 21 mod 256): FALLOC 0xea89 thru 0x19800 (0xad77 bytes) INTERIOR
> +22( 22 mod 256): FALLOC 0x569aa thru 0x586ea (0x1d40 bytes) INTERIOR
> +23( 23 mod 256): WRITE 0x35c00 thru 0x453ff (0xf800 bytes)
> +24( 24 mod 256): SKIPPED (no operation)
> +25( 25 mod 256): SKIPPED (no operation)
> +26( 26 mod 256): READ 0x21000 thru 0x26fff (0x6000 bytes)
> +27( 27 mod 256): READ 0x5e000 thru 0x61fff (0x4000 bytes)
> +28( 28 mod 256): WRITE 0x6f600 thru 0x6f7ff (0x200 bytes) HOLE
> +29( 29 mod 256): READ 0x13000 thru 0x19fff (0x7000 bytes)
> +30( 30 mod 256): TRUNCATE UP from 0x6f800 to 0x75400
> +31( 31 mod 256): READ 0x4000 thru 0xafff (0x7000 bytes)
> +32( 32 mod 256): SKIPPED (no operation)
> +33( 33 mod 256): FALLOC 0x31d49 thru 0x3c520 (0xa7d7 bytes) INTERIOR
> +34( 34 mod 256): FALLOC 0x2bbb3 thru 0x37ad8 (0xbf25 bytes) INTERIOR
> +35( 35 mod 256): READ 0x68000 thru 0x73fff (0xc000 bytes)
> +Correct content saved for comparison
> +(maybe hexdump "/mnt/a1/junk" vs "/mnt/a1/junk.fsxgood")
>
> I'm not quite sure if the messages match the test (best guess, neighbouring
> tests were fine):
>
> [102885.667444] btrfs csum failed ino 638 off 425984 csum 1842675109 private 2279232751
> [102885.676804] btrfs csum failed ino 638 off 430080 csum 1842675109 private 1192041375
> [102885.686094] btrfs csum failed ino 638 off 434176 csum 2297282744 private 1619428542
> [102885.686103] btrfs csum failed ino 638 off 438272 csum 3709984297 private 2868627320
> [102885.686112] btrfs csum failed ino 638 off 442368 csum 1504116677 private 1239355148
> [102885.686121] btrfs csum failed ino 638 off 446464 csum 1957839041 private 3848200057
> [102885.686129] btrfs csum failed ino 638 off 450560 csum 3836729483 private 2867416946

I think fsx leaves the bad file, you can test the inode number?

>
>
> 113:
> - it hung last evening and is still in that state, no disk or cpu activity,
>   there were only the tests running
> - no process is in D state, no btrfs kernel thread is active
> - the only interesting process is
>
>   PID TTY STAT TIME COMMAND
> 15585 pts/0 Sl+ 0:01 /root/xfstests/ltp/aio-stress -t 20 -s 10 -O -S -I \
>     1000 /mnt/a1/aiostress.15188.4 /mnt/a1/aiostress.15188.4.20 \
>     /mnt/a1/aiostress.15188.4.19 /mnt/a1/aiostress.15188.4.18 \
> [<ffffffff810af447>] futex_wait_queue_me+0xc7/0x100
> [<ffffffff810affa1>] futex_wait+0x191/0x280
> [<ffffffff810b1cb6>] do_futex+0xd6/0xbd0
> [<ffffffff810b282b>] sys_futex+0x7b/0x180
> [<ffffffff8195fe99>] system_call_fastpath+0x16/0x1b

Hmmm, I wonder if something else in rc1 is causing this?

-chris
David Sterba
2013-Jan-04 13:45 UTC
Re: [bug] csum mismatches and failed xfstests with 3.8-rc1
On Fri, Jan 04, 2013 at 08:01:44AM -0500, Chris Mason wrote:
> > I'm not quite sure if the messages match the test (best guess, neighbouring
> > tests were fine):
> >
> > [102885.667444] btrfs csum failed ino 638 off 425984 csum 1842675109 private 2279232751
> > [102885.676804] btrfs csum failed ino 638 off 430080 csum 1842675109 private 1192041375
> > [102885.686094] btrfs csum failed ino 638 off 434176 csum 2297282744 private 1619428542
> > [102885.686103] btrfs csum failed ino 638 off 438272 csum 3709984297 private 2868627320
> > [102885.686112] btrfs csum failed ino 638 off 442368 csum 1504116677 private 1239355148
> > [102885.686121] btrfs csum failed ino 638 off 446464 csum 1957839041 private 3848200057
> > [102885.686129] btrfs csum failed ino 638 off 450560 csum 3836729483 private 2867416946
>
> I think fsx leaves the bad file, you can test the inode number?

# btrfs inspect ino 638 /mnt/a1
/mnt/a1/junk

# md5sum /mnt/a1/junk
md5sum: /mnt/a1/junk: Input/output error

dmesg says:

[165711.839959] btrfs csum failed ino 638 off 262144 csum 3105135418 private 2393479556
[165711.899328] btrfs csum failed ino 638 off 262144 csum 2566472073 private 2393479556
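
The value 2566472073 appears in the second dmesg line above as the csum field, and in the aio-stress messages from the original report as the private field. One hypothesis, not stated anywhere in this thread, is that it is the CRC-32C of a zero-filled 4 KiB block. That would mean either the data read back was all zeros (when it shows up as csum) or the stored checksum was that of a zeroed block (when it shows up as private). This reading assumes that csum is the checksum computed from the data just read and private is the stored one, and that btrfs data checksums are standard CRC-32C; both are assumptions here. A minimal pure-Python sketch to test the hypothesis:

```python
def make_table(poly=0x82F63B78):
    """Build the 256-entry table for reflected CRC-32C (Castagnoli polynomial)."""
    table = []
    for n in range(256):
        crc = n
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


TABLE = make_table()


def crc32c(data: bytes) -> int:
    """Standard CRC-32C: initial value 0xFFFFFFFF, final XOR 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


# Sanity check against the published CRC-32C check value for "123456789".
assert crc32c(b"123456789") == 0xE3069283

RECURRING = 2566472073           # value that keeps appearing in the csum messages
zero_block = crc32c(bytes(4096))  # checksum of an all-zero 4 KiB block (assumed block size)

# Print both values in hex and whether they match.
print(hex(RECURRING), hex(zero_block), zero_block == RECURRING)
```

If the two values match, the repeated number points at zeroed blocks rather than random corruption, which would fit the no-space and truncate-heavy workloads in these tests. It would still say nothing about which code path produced the zeros.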
David Sterba
2013-Jan-07 17:03 UTC
Re: [bug] csum mismatches and failed xfstests with 3.8-rc1 and rc2
With top commit 5f243b9b46a22e5790dbbc36f574c2417af49a41 (something post
-rc2) I see more checksum errors

$ dmesg|grep csum|wc -l
100

and they all appeared within roughly the last 30 minutes. Previous test rounds
were clean, and I can see that the same test sequence ran 3 times in a row with
the same mount parameters without a mkfs between runs. Test messages seem to be
related to no-space conditions.

MKFS_OPTIONS -- /dev/sda9
MOUNT_OPTIONS -- -o space_cache,noatime /dev/sda9 /mnt/a2

091 41s ... [17:18:32] [17:18:33] [failed, exit status 1] - output mismatch (see 091.out.bad)
--- 091.out 2011-11-01 10:31:12.000000000 +0100
+++ 091.out.bad 2013-01-07 17:18:33.000000000 +0100
@@ -1,7 +1,6 @@
 QA output created by 091
 fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 32768 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
-fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -W
+./091: line 46: 11355 Segmentation fault $here/ltp/fsx $args $TEST_DIR/junk >> $seq.full 2>&1
+fsx -N 10000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z -R -W
+mapped writes DISABLED
+truncating to largest ever: 0x12a00

133 220s ... [17:28:51] [17:31:13] - output mismatch (see 133.out.bad)
--- 133.out 2011-02-11 11:42:31.000000000 +0100
+++ 133.out.bad 2013-01-07 17:31:13.000000000 +0100
@@ -2,4 +2,5 @@
 Buffered writer, buffered reader
 Direct writer, buffered reader
 Buffered writer, direct reader
+pread64: Input/output error
 Direct writer, direct reader

240 1s ... [17:40:51] [17:40:51] [failed, exit status 11] - output mismatch (see 240.out.bad)
--- 240.out 2011-08-10 17:17:23.000000000 +0200
+++ 240.out.bad 2013-01-07 17:40:51.000000000 +0100
@@ -1,2 +1,4 @@
 QA output created by 240
 Silence is golden.
+AIO write offset 33280 expected 4096 got -5
+short read() at offset 29184

247 41s ... [17:42:16] [17:42:36] [failed, exit status 1] - output mismatch (see 247.out.bad)
--- 247.out 2011-08-10 17:17:23.000000000 +0200
+++ 247.out.bad 2013-01-07 17:42:36.000000000 +0100
@@ -1,2 +1,2 @@
 QA output created by 247
-Silence is golden.
+Bus error

263 237s ... [17:45:00] [17:45:01] [failed, exit status 1] - output mismatch (see 263.out.bad)
--- 263.out 2011-11-01 10:31:12.000000000 +0100
+++ 263.out.bad 2013-01-07 17:45:01.000000000 +0100
@@ -1,3 +1,100 @@
 QA output created by 263
 fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z
-fsx -N 10000 -o 128000 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z
+fsx -N 10000 -o 8192 -l 500000 -r PSIZE -t BSIZE -w BSIZE -Z
+truncating to largest ever: 0x12a00
+truncating to largest ever: 0x75400
+dowrite: write: Input/output error
+LOG DUMP (91 total operations):
+1( 1 mod 256): SKIPPED (no operation)
+2( 2 mod 256): MAPWRITE 0x62600 thru 0x626e8 (0xe9 bytes)
+3( 3 mod 256): FALLOC 0x2e0f2 thru 0x2f438 (0x1346 bytes) INTERIOR
+4( 4 mod 256): TRUNCATE DOWN from 0x626e9 to 0x12a00
+5( 5 mod 256): MAPREAD 0x0 thru 0x90e (0x90f bytes)
+6( 6 mod 256): FALLOC 0x7048 thru 0x80ca (0x1082 bytes) INTERIOR
+7( 7 mod 256): WRITE 0x5ea00 thru 0x5f3ff (0xa00 bytes) HOLE
+8( 8 mod 256): SKIPPED (no operation)
+9( 9 mod 256): FALLOC 0x4957f thru 0x4af1f (0x19a0 bytes) INTERIOR
+10( 10 mod 256): SKIPPED (no operation)
+11( 11 mod 256): WRITE 0x10a00 thru 0x127ff (0x1e00 bytes)
+12( 12 mod 256): MAPWRITE 0x53800 thru 0x54384 (0xb85 bytes)
+13( 13 mod 256): MAPWRITE 0x5ae00 thru 0x5c8a9 (0x1aaa bytes)
+14( 14 mod 256): READ 0x4a000 thru 0x4afff (0x1000 bytes)
+15( 15 mod 256): SKIPPED (no operation)
+16( 16 mod 256): MAPREAD 0x36000 thru 0x36acf (0xad0 bytes)
+17( 17 mod 256): SKIPPED (no operation)
+18( 18 mod 256): READ 0x12000 thru 0x12fff (0x1000 bytes)
+19( 19 mod 256): MAPWRITE 0x17600 thru 0x17b42 (0x543 bytes)
+20( 20 mod 256): READ 0x29000 thru 0x29fff (0x1000 bytes)
+21( 21 mod 256): FALLOC 0xea89 thru 0x1023e (0x17b5 bytes) INTERIOR
+22( 22 mod 256): FALLOC 0x569aa thru 0x5847f (0x1ad5 bytes) INTERIOR
+23( 23 mod 256): WRITE 0x35c00 thru 0x369ff (0xe00 bytes)
+24( 24 mod 256): SKIPPED (no operation)
+25( 25 mod 256): SKIPPED (no operation)
+26( 26 mod 256): MAPREAD 0x32000 thru 0x32188 (0x189 bytes)
+27( 27 mod 256): MAPREAD 0x3f000 thru 0x40e43 (0x1e44 bytes)
+28( 28 mod 256): WRITE 0x6f600 thru 0x705ff (0x1000 bytes) HOLE
+29( 29 mod 256): MAPREAD 0x5b000 thru 0x5c99e (0x199f bytes)
+30( 30 mod 256): TRUNCATE UP from 0x70600 to 0x75400
+31( 31 mod 256): MAPREAD 0x4000 thru 0x54d5 (0x14d6 bytes)
+32( 32 mod 256): SKIPPED (no operation)
+33( 33 mod 256): FALLOC 0x31d49 thru 0x32872 (0xb29 bytes) INTERIOR
+34( 34 mod 256): FALLOC 0x2bbb3 thru 0x2bca7 (0xf4 bytes) INTERIOR
+35( 35 mod 256): MAPREAD 0x68000 thru 0x68d8e (0xd8f bytes)
+36( 36 mod 256): FALLOC 0x2a075 thru 0x2aed7 (0xe62 bytes) INTERIOR
+37( 37 mod 256): MAPWRITE 0x24800 thru 0x25d8f (0x1590 bytes)
+38( 38 mod 256): SKIPPED (no operation)
+39( 39 mod 256): FALLOC 0x25e59 thru 0x26a93 (0xc3a bytes) INTERIOR
+40( 40 mod 256): WRITE 0x1a600 thru 0x1afff (0xa00 bytes)
+41( 41 mod 256): SKIPPED (no operation)
+42( 42 mod 256): MAPREAD 0x72000 thru 0x729ce (0x9cf bytes)
+43( 43 mod 256): MAPREAD 0x4f000 thru 0x4fe79 (0xe7a bytes)
+44( 44 mod 256): FALLOC 0x114aa thru 0x12d40 (0x1896 bytes) INTERIOR
+45( 45 mod 256): READ 0x1f000 thru 0x1ffff (0x1000 bytes)
+46( 46 mod 256): MAPWRITE 0x54600 thru 0x55272 (0xc73 bytes)
+47( 47 mod 256): WRITE 0xb600 thru 0xbfff (0xa00 bytes)
+48( 48 mod 256): SKIPPED (no operation)
+49( 49 mod 256): WRITE 0x78600 thru 0x787ff (0x200 bytes) HOLE
+50( 50 mod 256): READ 0x24000 thru 0x24fff (0x1000 bytes)
+51( 51 mod 256): WRITE 0x17a00 thru 0x18bff (0x1200 bytes)
+52( 52 mod 256): MAPREAD 0x19000 thru 0x19093 (0x94 bytes)
+53( 53 mod 256): SKIPPED (no operation)
+54( 54 mod 256): WRITE 0x1a600 thru 0x1b3ff (0xe00 bytes)
+55( 55 mod 256): SKIPPED (no operation)
+56( 56 mod 256): READ 0x2e000 thru 0x2efff (0x1000 bytes)
+57( 57 mod 256): FALLOC 0x6f9e9 thru 0x7172c (0x1d43 bytes) INTERIOR
+58( 58 mod 256): SKIPPED (no operation)
+59( 59 mod 256): SKIPPED (no operation)
+60( 60 mod 256): FALLOC 0x38a61 thru 0x3a6d2 (0x1c71 bytes) INTERIOR
+61( 61 mod 256): SKIPPED (no operation)
+62( 62 mod 256): SKIPPED (no operation)
+63( 63 mod 256): WRITE 0x10800 thru 0x11dff (0x1600 bytes)
+64( 64 mod 256): WRITE 0x65000 thru 0x655ff (0x600 bytes)
+65( 65 mod 256): TRUNCATE DOWN from 0x78800 to 0xb200
+66( 66 mod 256): SKIPPED (no operation)
+67( 67 mod 256): MAPWRITE 0x72200 thru 0x73d0d (0x1b0e bytes)
+68( 68 mod 256): MAPWRITE 0x21200 thru 0x21f14 (0xd15 bytes)
+69( 69 mod 256): SKIPPED (no operation)
+70( 70 mod 256): SKIPPED (no operation)
+71( 71 mod 256): MAPWRITE 0x6c000 thru 0x6db45 (0x1b46 bytes)
+72( 72 mod 256): READ 0x3d000 thru 0x3dfff (0x1000 bytes)
+73( 73 mod 256): READ 0x34000 thru 0x34fff (0x1000 bytes)
+74( 74 mod 256): MAPREAD 0x20000 thru 0x218f5 (0x18f6 bytes)
+75( 75 mod 256): MAPREAD 0x29000 thru 0x2a1a6 (0x11a7 bytes)
+76( 76 mod 256): READ 0x62000 thru 0x62fff (0x1000 bytes)
+77( 77 mod 256): SKIPPED (no operation)
+78( 78 mod 256): MAPWRITE 0x2be00 thru 0x2c035 (0x236 bytes)
+79( 79 mod 256): TRUNCATE DOWN from 0x73d0e to 0x14e00
+80( 80 mod 256): MAPWRITE 0x33800 thru 0x355a6 (0x1da7 bytes)
+81( 81 mod 256): SKIPPED (no operation)
+82( 82 mod 256): MAPREAD 0x2000 thru 0x3e54 (0x1e55 bytes)
+83( 83 mod 256): MAPREAD 0x5000 thru 0x6c7f (0x1c80 bytes)
+84( 84 mod 256): MAPREAD 0x8000 thru 0x9845 (0x1846 bytes)
+85( 85 mod 256): SKIPPED (no operation)
+86( 86 mod 256): FALLOC 0x3eb18 thru 0x3fd14 (0x11fc bytes) INTERIOR
+87( 87 mod 256): SKIPPED (no operation)
+88( 88 mod 256): WRITE 0x21600 thru 0x21fff (0xa00 bytes)
+89( 89 mod 256): WRITE 0x10200 thru 0x105ff (0x400 bytes)
+90( 90 mod 256): WRITE 0x2be00 thru 0x2c5ff (0x800 bytes)
+91( 91 mod 256): WRITE 0x2b200 thru 0x2c3ff (0x1200 bytes)
+Correct content saved for comparison
+(maybe hexdump "/mnt/a1/junk" vs "/mnt/a1/junk.fsxgood")

david
David Sterba
2013-Jan-07 17:06 UTC
Re: [bug] csum mismatches and failed xfstests with 3.8-rc1 and rc2
On Mon, Jan 07, 2013 at 06:03:51PM +0100, David Sterba wrote:
> with top commit 5f243b9b46a22e5790dbbc36f574c2417af49a41 (something post
> -rc2) I see more checksum errors
>
> $ dmesg|grep csum|wc -l
> 100

more of dmesg:

[15303.739076] btrfs csum failed ino 63791 off 368640 csum 3994424334 private 783210346
[15303.748113] ------------[ cut here ]------------
[15303.752052] kernel BUG at mm/page-writeback.c:2164!
[15303.752052] invalid opcode: 0000 [#1] SMP
[15303.752052] Modules linked in: dm_crypt loop btrfs
[15303.752052] CPU 0
[15303.752052] Pid: 11355, comm: fsx Not tainted 3.8.0-rc2-default+ #228 Intel Corporation Santa Rosa platform/Matanzas
[15303.752052] RIP: 0010:[<ffffffff81118239>] [<ffffffff81118239>] clear_page_dirty_for_io+0x119/0x130
[15303.752052] RSP: 0018:ffff88002024fb28 EFLAGS: 00010246
[15303.752052] RAX: 4000000000000802 RBX: ffffea00013c68f0 RCX: 0000000000000000
[15303.752052] RDX: 0000000000000011 RSI: ffffffff81151320 RDI: ffffea00013c68f0
[15303.752052] RBP: ffff88002024fb38 R08: 0000000000000000 R09: ffff88004fa65640
[15303.752052] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88006dd6e780
[15303.752052] R13: 0000000000000000 R14: ffff880063f4fac0 R15: 0000000000000001
[15303.752052] FS: 00007f7067547700(0000) GS:ffff88007d800000(0000) knlGS:0000000000000000
[15303.752052] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[15303.752052] CR2: 00007f7066d2cb10 CR3: 000000001dafe000 CR4: 00000000000007f0
[15303.752052] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[15303.752052] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[15303.752052] Process fsx (pid: 11355, threadinfo ffff88002024e000, task ffff880058334440)
[15303.752052] Stack:
[15303.752052] ffff880063f4fac0 0000000000000001 ffff88002024fc08 ffffffffa0044092
[15303.752052] ffff88002024fbd0 ffff880000000050 0000000000022000 00001000819572cb
[15303.752052] 0000000000000000 000000000005b000 00000000797903b8 000000000005ae00
[15303.752052] Call Trace:
[15303.752052] [<ffffffffa0044092>] prepare_pages+0x302/0x3a0 [btrfs]
[15303.752052] [<ffffffffa00455ed>] __btrfs_buffered_write+0x1ad/0x370 [btrfs]
[15303.752052] [<ffffffffa0045cab>] btrfs_file_aio_write+0x4fb/0x5e0 [btrfs]
[15303.752052] [<ffffffff8115e04a>] do_sync_write+0xaa/0xf0
[15303.752052] [<ffffffff8115e74e>] vfs_write+0xce/0x190
[15303.752052] [<ffffffff8115eab2>] sys_write+0x62/0xb0
[15303.752052] [<ffffffff8139a2de>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[15303.752052] [<ffffffff81960699>] system_call_fastpath+0x16/0x1b
[15303.752052] Code: 00 00 48 89 df e8 38 fb ff ff e9 73 ff ff ff 0f 1f 00 48 89 df 57 9d 66 66 90 66 90 e8 51 e5 f8 ff b8 01 00 00 00 e9 3a ff ff ff <0f> 0b 49 c7 c4 20 7f e4 81 e9 07 ff ff ff 66 0f 1f 84 00 00 00
[15303.752052] RIP [<ffffffff81118239>] clear_page_dirty_for_io+0x119/0x130
[15303.752052] RSP <ffff88002024fb28>
[15304.049369] ---[ end trace e69c5ab147a5d842 ]---
[15304.358163] ------------[ cut here ]------------
[15304.364297] WARNING: at fs/btrfs/inode.c:7123 btrfs_destroy_inode+0x23e/0x2e0 [btrfs]()
[15304.373804] Hardware name: Santa Rosa platform
[15304.373809] Modules linked in: dm_crypt loop btrfs
[15304.373813] Pid: 11387, comm: umount Tainted: G D 3.8.0-rc2-default+ #228
[15304.373814] Call Trace:
[15304.373822] [<ffffffff8104c6bf>] warn_slowpath_common+0x7f/0xc0
[15304.373831] [<ffffffff8104c71a>] warn_slowpath_null+0x1a/0x20
[15304.373851] [<ffffffffa00435be>] btrfs_destroy_inode+0x23e/0x2e0 [btrfs]
[15304.373857] [<ffffffff81178a4c>] destroy_inode+0x3c/0x70
[15304.373862] [<ffffffff81178ba6>] evict+0x126/0x1c0
[15304.373867] [<ffffffff81178c8f>] dispose_list+0x4f/0x60
[15304.373873] [<ffffffff811799f4>] evict_inodes+0x114/0x130
[15304.373879] [<ffffffff81160903>] generic_shutdown_super+0x53/0xf0
[15304.373886] [<ffffffff81160a36>] kill_anon_super+0x16/0x30
[15304.373898] [<ffffffffa0004d9a>] btrfs_kill_super+0x1a/0x90 [btrfs]
[15304.373903] [<ffffffff81161902>] ? deactivate_super+0x42/0x70
[15304.373909] [<ffffffff81160c8d>] deactivate_locked_super+0x3d/0x90
[15304.373914] [<ffffffff8116190a>] deactivate_super+0x4a/0x70
[15304.373920] [<ffffffff8117dc90>] mntput_no_expire+0x100/0x160
[15304.373925] [<ffffffff8117ecd1>] sys_umount+0x71/0x3c0
[15304.373931] [<ffffffff81960699>] system_call_fastpath+0x16/0x1b
[15304.373934] ---[ end trace e69c5ab147a5d843 ]---
[15304.373937] ------------[ cut here ]------------
[15304.373954] WARNING: at fs/btrfs/inode.c:7124 btrfs_destroy_inode+0x2c6/0x2e0 [btrfs]()
[15304.373955] Hardware name: Santa Rosa platform
[15304.373963] Modules linked in: dm_crypt loop btrfs
[15304.373968] Pid: 11387, comm: umount Tainted: G D W 3.8.0-rc2-default+ #228
[15304.373969] Call Trace:
[15304.373974] [<ffffffff8104c6bf>] warn_slowpath_common+0x7f/0xc0
[15304.373979] [<ffffffff8104c71a>] warn_slowpath_null+0x1a/0x20
[15304.373998] [<ffffffffa0043646>] btrfs_destroy_inode+0x2c6/0x2e0 [btrfs]
[15304.374003] [<ffffffff81178a4c>] destroy_inode+0x3c/0x70
[15304.374008] [<ffffffff81178ba6>] evict+0x126/0x1c0
[15304.374012] [<ffffffff81178c8f>] dispose_list+0x4f/0x60
[15304.374017] [<ffffffff811799f4>] evict_inodes+0x114/0x130
[15304.374022] [<ffffffff81160903>] generic_shutdown_super+0x53/0xf0
[15304.374027] [<ffffffff81160a36>] kill_anon_super+0x16/0x30
[15304.374039] [<ffffffffa0004d9a>] btrfs_kill_super+0x1a/0x90 [btrfs]
[15304.374044] [<ffffffff81161902>] ? deactivate_super+0x42/0x70
[15304.374049] [<ffffffff81160c8d>] deactivate_locked_super+0x3d/0x90
[15304.374054] [<ffffffff8116190a>] deactivate_super+0x4a/0x70
[15304.374059] [<ffffffff8117dc90>] mntput_no_expire+0x100/0x160
[15304.374064] [<ffffffff8117ecd1>] sys_umount+0x71/0x3c0
[15304.374069] [<ffffffff81960699>] system_call_fastpath+0x16/0x1b
[15304.374072] ---[ end trace e69c5ab147a5d844 ]---
[15304.374075] ------------[ cut here ]------------
[15304.374092] WARNING: at fs/btrfs/inode.c:7126 btrfs_destroy_inode+0x29a/0x2e0 [btrfs]()
[15304.374095] Hardware name: Santa Rosa platform
[15304.374100] Modules linked in: dm_crypt loop btrfs
[15304.374104] Pid: 11387, comm: umount Tainted: G D W 3.8.0-rc2-default+ #228
[15304.374105] Call Trace:
[15304.374110] [<ffffffff8104c6bf>] warn_slowpath_common+0x7f/0xc0
[15304.374115] [<ffffffff8104c71a>] warn_slowpath_null+0x1a/0x20
[15304.374134] [<ffffffffa004361a>] btrfs_destroy_inode+0x29a/0x2e0 [btrfs]
[15304.374139] [<ffffffff81178a4c>] destroy_inode+0x3c/0x70
[15304.374144] [<ffffffff81178ba6>] evict+0x126/0x1c0
[15304.374149] [<ffffffff81178c8f>] dispose_list+0x4f/0x60
[15304.374153] [<ffffffff811799f4>] evict_inodes+0x114/0x130
[15304.374159] [<ffffffff81160903>] generic_shutdown_super+0x53/0xf0
[15304.374163] [<ffffffff81160a36>] kill_anon_super+0x16/0x30
[15304.374177] [<ffffffffa0004d9a>] btrfs_kill_super+0x1a/0x90 [btrfs]
[15304.374182] [<ffffffff81161902>] ? deactivate_super+0x42/0x70
[15304.374187] [<ffffffff81160c8d>] deactivate_locked_super+0x3d/0x90
[15304.374192] [<ffffffff8116190a>] deactivate_super+0x4a/0x70
[15304.374197] [<ffffffff8117dc90>] mntput_no_expire+0x100/0x160
[15304.374202] [<ffffffff8117ecd1>] sys_umount+0x71/0x3c0
[15304.374207] [<ffffffff81960699>] system_call_fastpath+0x16/0x1b
[15304.374210] ---[ end trace e69c5ab147a5d845 ]---
[15304.829781] ------------[ cut here ]------------
[15304.835437] WARNING: at fs/btrfs/extent-tree.c:4366 btrfs_free_block_groups+0x2cc/0x370 [btrfs]()
[15304.845358] Hardware name: Santa Rosa platform
[15304.845362] Modules linked in: dm_crypt loop btrfs
[15304.845365] Pid: 11387, comm: umount Tainted: G D W 3.8.0-rc2-default+ #228
[15304.845366] Call Trace:
[15304.845373] [<ffffffff8104c6bf>] warn_slowpath_common+0x7f/0xc0
[15304.845377] [<ffffffff8104c71a>] warn_slowpath_null+0x1a/0x20
[15304.845389] [<ffffffffa002261c>] btrfs_free_block_groups+0x2cc/0x370 [btrfs]
[15304.845402] [<ffffffffa002fc5a>] close_ctree+0x1ca/0x340 [btrfs]
[15304.845409] [<ffffffff81178c8f>] ? dispose_list+0x4f/0x60
[15304.845412] [<ffffffff811799f4>] ? evict_inodes+0x114/0x130
[15304.845422] [<ffffffffa0003c69>] btrfs_put_super+0x19/0x20 [btrfs]
[15304.845427] [<ffffffff81160912>] generic_shutdown_super+0x62/0xf0
[15304.845431] [<ffffffff81160a36>] kill_anon_super+0x16/0x30
[15304.845441] [<ffffffffa0004d9a>] btrfs_kill_super+0x1a/0x90 [btrfs]
[15304.845445] [<ffffffff81161902>] ? deactivate_super+0x42/0x70
[15304.845449] [<ffffffff81160c8d>] deactivate_locked_super+0x3d/0x90
[15304.845453] [<ffffffff8116190a>] deactivate_super+0x4a/0x70
[15304.845458] [<ffffffff8117dc90>] mntput_no_expire+0x100/0x160
[15304.845462] [<ffffffff8117ecd1>] sys_umount+0x71/0x3c0
[15304.845467] [<ffffffff81960699>] system_call_fastpath+0x16/0x1b
[15304.845469] ---[ end trace e69c5ab147a5d846 ]---
[15304.845471] ------------[ cut here ]------------
[15304.845481] WARNING: at fs/btrfs/extent-tree.c:4367 btrfs_free_block_groups+0x366/0x370 [btrfs]()
[15304.845482] Hardware name: Santa Rosa platform
[15304.845488] Modules linked in: dm_crypt loop btrfs
[15304.845490] Pid: 11387, comm: umount Tainted: G D W 3.8.0-rc2-default+ #228
[15304.845491] Call Trace:
[15304.845496] [<ffffffff8104c6bf>] warn_slowpath_common+0x7f/0xc0
[15304.845499] [<ffffffff8104c71a>] warn_slowpath_null+0x1a/0x20
[15304.845511] [<ffffffffa00226b6>] btrfs_free_block_groups+0x366/0x370 [btrfs]
[15304.845524] [<ffffffffa002fc5a>] close_ctree+0x1ca/0x340 [btrfs]
[15304.845528] [<ffffffff81178c8f>] ? dispose_list+0x4f/0x60
[15304.845532] [<ffffffff811799f4>] ? evict_inodes+0x114/0x130
[15304.845542] [<ffffffffa0003c69>] btrfs_put_super+0x19/0x20 [btrfs]
[15304.845547] [<ffffffff81160912>] generic_shutdown_super+0x62/0xf0
[15304.845551] [<ffffffff81160a36>] kill_anon_super+0x16/0x30
[15304.845560] [<ffffffffa0004d9a>] btrfs_kill_super+0x1a/0x90 [btrfs]
[15304.845564] [<ffffffff81161902>] ? deactivate_super+0x42/0x70
[15304.845568] [<ffffffff81160c8d>] deactivate_locked_super+0x3d/0x90
[15304.845572] [<ffffffff8116190a>] deactivate_super+0x4a/0x70
[15304.845577] [<ffffffff8117dc90>] mntput_no_expire+0x100/0x160
[15304.845580] [<ffffffff8117ecd1>] sys_umount+0x71/0x3c0
[15304.845584] [<ffffffff81960699>] system_call_fastpath+0x16/0x1b
[15304.845587] ---[ end trace e69c5ab147a5d847 ]---
[15304.845589] ------------[ cut here ]------------
[15304.845599] WARNING: at fs/btrfs/extent-tree.c:7657 btrfs_free_block_groups+0x25b/0x370 [btrfs]()
[15304.845600] Hardware name: Santa Rosa platform
[15304.845603] Modules linked in: dm_crypt loop btrfs
[15304.845606] Pid: 11387, comm: umount Tainted: G D W 3.8.0-rc2-default+ #228
[15304.845607] Call Trace:
[15304.845611] [<ffffffff8104c6bf>] warn_slowpath_common+0x7f/0xc0
[15304.845614] [<ffffffff8104c71a>] warn_slowpath_null+0x1a/0x20
[15304.845625] [<ffffffffa00225ab>] btrfs_free_block_groups+0x25b/0x370 [btrfs]
[15304.845639] [<ffffffffa002fc5a>] close_ctree+0x1ca/0x340 [btrfs]
[15304.845643] [<ffffffff81178c8f>] ? dispose_list+0x4f/0x60
[15304.845647] [<ffffffff811799f4>] ? evict_inodes+0x114/0x130
[15304.845656] [<ffffffffa0003c69>] btrfs_put_super+0x19/0x20 [btrfs]
[15304.845660] [<ffffffff81160912>] generic_shutdown_super+0x62/0xf0
[15304.845664] [<ffffffff81160a36>] kill_anon_super+0x16/0x30
[15304.845673] [<ffffffffa0004d9a>] btrfs_kill_super+0x1a/0x90 [btrfs]
[15304.845677] [<ffffffff81161902>] ? deactivate_super+0x42/0x70
[15304.845681] [<ffffffff81160c8d>] deactivate_locked_super+0x3d/0x90
[15304.845685] [<ffffffff8116190a>] deactivate_super+0x4a/0x70
[15304.845689] [<ffffffff8117dc90>] mntput_no_expire+0x100/0x160
[15304.845693] [<ffffffff8117ecd1>] sys_umount+0x71/0x3c0
[15304.845697] [<ffffffff81960699>] system_call_fastpath+0x16/0x1b
[15304.845700] ---[ end trace e69c5ab147a5d848 ]---
[15304.845703] space_info 1 has 4277313536 free, is not full
[15304.845705] space_info total=4303355904, used=25976832, pinned=0, reserved=0, may_use=4096, readonly=65536
[15304.845707] ------------[ cut here ]------------
[15304.845717] WARNING: at fs/btrfs/extent-tree.c:7657 btrfs_free_block_groups+0x25b/0x370 [btrfs]()
[15304.845718] Hardware name: Santa Rosa platform
[15304.845722] Modules linked in: dm_crypt loop btrfs
[15304.845725] Pid: 11387, comm: umount Tainted: G D W 3.8.0-rc2-default+ #228
[15304.845725] Call Trace:
[15304.845728] [<ffffffff8104c6bf>] warn_slowpath_common+0x7f/0xc0
[15304.845731] [<ffffffff8104c71a>] warn_slowpath_null+0x1a/0x20
[15304.845741] [<ffffffffa00225ab>] btrfs_free_block_groups+0x25b/0x370 [btrfs]
[15304.845753] [<ffffffffa002fc5a>] close_ctree+0x1ca/0x340 [btrfs]
[15304.845755] [<ffffffff81178c8f>] ? dispose_list+0x4f/0x60
[15304.845757] [<ffffffff811799f4>] ? evict_inodes+0x114/0x130
[15304.845765] [<ffffffffa0003c69>] btrfs_put_super+0x19/0x20 [btrfs]
[15304.845768] [<ffffffff81160912>] generic_shutdown_super+0x62/0xf0
[15304.845770] [<ffffffff81160a36>] kill_anon_super+0x16/0x30
[15304.845778] [<ffffffffa0004d9a>] btrfs_kill_super+0x1a/0x90 [btrfs]
[15304.845780] [<ffffffff81161902>] ? deactivate_super+0x42/0x70
[15304.845782] [<ffffffff81160c8d>] deactivate_locked_super+0x3d/0x90
[15304.845785] [<ffffffff8116190a>] deactivate_super+0x4a/0x70
[15304.845787] [<ffffffff8117dc90>] mntput_no_expire+0x100/0x160
[15304.845790] [<ffffffff8117ecd1>] sys_umount+0x71/0x3c0
[15304.845792] [<ffffffff81960699>] system_call_fastpath+0x16/0x1b
[15304.845793] ---[ end trace e69c5ab147a5d849 ]---
[15304.845795] space_info 4 has 1072345088 free, is not full
[15304.845797] space_info total=1082130432, used=1265664, pinned=0, reserved=0, may_use=294912, readonly=8519680
[15306.426312] btrfs: disk space caching is enabled
[15694.784899] btrfs csum failed ino 66356 off 8888320 csum 2171778039 private 2566472073
[15694.794201] btrfs csum failed ino 66356 off 8908800 csum 3905765731 private 2566472073
[15921.985270] btrfs: disk space caching is enabled
[15940.469676] btrfs csum failed ino 66350 off 589824 csum 2566472073 private 628520344
[15940.479220] btrfs csum failed ino 66350 off 786432 csum 2566472073 private 628520344
[15940.603713] btrfs csum failed ino 66350 off 11665408 csum 2566472073 private 628520344
[15940.613342] btrfs csum failed ino 66350 off 12058624 csum 2566472073 private 628520344
[15940.623017] btrfs csum failed ino 66350 off 12451840 csum 2566472073 private 628520344
[15940.643849] btrfs csum failed ino 66350 off 12845056 csum 2566472073 private 628520344
[15945.783819] btrfs csum failed ino 66350 off 536662016 csum 2566472073 private 628520344
[15970.955754] btrfs csum failed ino 66351 off 5767168 csum 108545150 private 628520344
[15973.188458] btrfs csum failed ino 66351 off 95813632 csum 2566472073 private 628520344
[15973.325945] btrfs csum failed ino 66351 off 96206848 csum 2566472073 private 628520344
[15973.335697] btrfs csum failed ino 66351 off 96600064 csum 2566472073 private 628520344
[15973.393742] btrfs csum failed ino 66351 off 101122048 csum 2566472073 private 628520344
[15973.836510] btrfs csum failed ino 66351 off 101515264 csum
2566472073 private 628520344 [15973.846280] btrfs csum failed ino 66351 off 101908480 csum 2566472073 private 628520344 [16016.241905] btrfs csum failed ino 66352 off 48234496 csum 2566472073 private 628520344 [16016.251038] btrfs csum failed ino 66352 off 48238592 csum 2566472073 private 628520344 [16016.251043] btrfs csum failed ino 66352 off 48242688 csum 2566472073 private 628520344 [16016.251049] btrfs csum failed ino 66352 off 48246784 csum 2566472073 private 628520344 [16016.251054] btrfs csum failed ino 66352 off 48250880 csum 2566472073 private 628520344 [16016.251059] btrfs csum failed ino 66352 off 48254976 csum 2566472073 private 628520344 [16016.251063] btrfs csum failed ino 66352 off 48259072 csum 2566472073 private 628520344 [16016.251068] btrfs csum failed ino 66352 off 48263168 csum 2566472073 private 628520344 [16016.251072] btrfs csum failed ino 66352 off 48267264 csum 2566472073 private 628520344 [16016.251077] btrfs csum failed ino 66352 off 48271360 csum 2566472073 private 628520344 [16016.251081] btrfs csum failed ino 66352 off 48275456 csum 2566472073 private 628520344 [16016.251086] btrfs csum failed ino 66352 off 48279552 csum 2566472073 private 628520344 [16016.251091] btrfs csum failed ino 66352 off 48283648 csum 2566472073 private 628520344 [16016.251095] btrfs csum failed ino 66352 off 48287744 csum 2566472073 private 628520344 [16016.251100] btrfs csum failed ino 66352 off 48291840 csum 2566472073 private 628520344 [16016.251104] btrfs csum failed ino 66352 off 48295936 csum 2566472073 private 628520344 [16365.034223] btrfs: disk space caching is enabled [16365.660772] btrfs csum failed ino 66352 off 122880 csum 2962417916 private 1108896639 [16365.674192] btrfs csum failed ino 66352 off 57344 csum 2566472073 private 1108896639 [16366.378533] btrfs csum failed ino 66352 off 577536 csum 2272747479 private 1108896639 [16366.427623] btrfs csum failed ino 66352 off 970752 csum 2272747479 private 1108896639 [16366.636736] btrfs csum failed 
ino 66352 off 5062656 csum 2962417916 private 1108896639 [16366.656774] btrfs csum failed ino 66352 off 5652480 csum 2962417916 private 1108896639 [16366.666961] btrfs csum failed ino 66352 off 6045696 csum 2962417916 private 1108896639 [16366.805461] btrfs csum failed ino 66352 off 6504448 csum 2962417916 private 1108896639 [16366.815044] btrfs csum failed ino 66352 off 6701056 csum 2962417916 private 1108896639 [16366.954780] btrfs csum failed ino 66352 off 7032832 csum 2962417916 private 1108896639 [16370.863641] btrfs_readpage_end_io_hook: 57 callbacks suppressed [16370.870674] btrfs csum failed ino 66352 off 671744 csum 628520344 private 4063643746 [16370.879802] btrfs csum failed ino 66352 off 868352 csum 628520344 private 4063643746 [16371.047197] btrfs csum failed ino 66352 off 1458176 csum 628520344 private 4063643746 [16371.067697] btrfs csum failed ino 66352 off 1851392 csum 628520344 private 4063643746 [16371.229767] btrfs csum failed ino 66352 off 2433024 csum 628520344 private 4063643746 [16371.262798] btrfs csum failed ino 66352 off 2572288 csum 2566472073 private 4063643746 [16371.320675] btrfs csum failed ino 66352 off 2834432 csum 628520344 private 4063643746 [16371.411517] btrfs csum failed ino 66352 off 2912256 csum 628520344 private 4063643746 [16371.433164] btrfs csum failed ino 66352 off 2965504 csum 628520344 private 4063643746 [16371.559819] btrfs csum failed ino 66352 off 3227648 csum 628520344 private 4063643746 [16375.949306] btrfs_readpage_end_io_hook: 67 callbacks suppressed [16375.956367] btrfs csum failed ino 66352 off 278528 csum 628520344 private 665344948 [16376.052581] btrfs csum failed ino 66352 off 356352 csum 628520344 private 665344948 [16376.073033] btrfs csum failed ino 66352 off 409600 csum 2566472073 private 665344948 [16376.120465] btrfs csum failed ino 66352 off 606208 csum 628520344 private 665344948 [16376.155395] btrfs csum failed ino 66352 off 671744 csum 2566472073 private 665344948 [16376.372957] btrfs csum failed 
ino 66352 off 999424 csum 628520344 private 665344948 [16376.382185] btrfs csum failed ino 66352 off 1196032 csum 628520344 private 665344948 [16376.409080] btrfs csum failed ino 66352 off 1589248 csum 628520344 private 665344948 [16376.521638] btrfs csum failed ino 66352 off 1785856 csum 628520344 private 665344948 [16376.570192] btrfs csum failed ino 66352 off 1982464 csum 2566472073 private 665344948 [16381.059067] btrfs_readpage_end_io_hook: 71 callbacks suppressed [16381.066104] btrfs csum failed ino 66352 off 5193728 csum 628520344 private 1298616927 [16381.153326] btrfs csum failed ino 66352 off 5390336 csum 2566472073 private 1298616927 [16381.166398] btrfs csum failed ino 66352 off 5455872 csum 628520344 private 1298616927 [16381.294876] btrfs csum failed ino 66352 off 5537792 csum 628520344 private 1298616927 [16381.314543] btrfs csum failed ino 66352 off 5586944 csum 628520344 private 1298616927 [16381.371511] btrfs csum failed ino 66352 off 5783552 csum 628520344 private 1298616927 [16381.388887] btrfs csum failed ino 66352 off 5849088 csum 628520344 private 1298616927 [16381.422504] btrfs csum failed ino 66352 off 5914624 csum 2566472073 private 1298616927 [16381.521612] btrfs csum failed ino 66352 off 6045696 csum 628520344 private 1298616927 [16381.540930] btrfs csum failed ino 66352 off 6111232 csum 628520344 private 1298616927 [16386.174164] btrfs_readpage_end_io_hook: 65 callbacks suppressed [16386.181196] btrfs csum failed ino 66352 off 8019968 csum 2566472073 private 2542172329 [16386.193769] btrfs csum failed ino 66352 off 8077312 csum 628520344 private 2542172329 [16386.251784] btrfs csum failed ino 66352 off 8339456 csum 628520344 private 2542172329 [16386.260941] btrfs csum failed ino 66352 off 8273920 csum 2566472073 private 2542172329 [16386.385256] btrfs csum failed ino 66352 off 147456 csum 2566472073 private 1402544159 [16386.472192] btrfs csum failed ino 66352 off 278528 csum 628520344 private 1402544159 [16386.585245] btrfs csum 
failed ino 66352 off 409600 csum 2566472073 private 1402544159 [16386.622313] btrfs csum failed ino 66352 off 475136 csum 628520344 private 1402544159 [16386.666599] btrfs csum failed ino 66352 off 606208 csum 628520344 private 1402544159 [16386.780363] btrfs csum failed ino 66352 off 737280 csum 628520344 private 1402544159 [16391.201723] btrfs_readpage_end_io_hook: 64 callbacks suppressed [16391.208744] btrfs csum failed ino 66352 off 2441216 csum 628520344 private 2306345705 [16391.217975] btrfs csum failed ino 66352 off 2506752 csum 2566472073 private 2306345705 [16391.399303] btrfs csum failed ino 66352 off 2834432 csum 628520344 private 2306345705 [16391.408707] btrfs csum failed ino 66352 off 3031040 csum 628520344 private 2306345705 [16391.591001] btrfs csum failed ino 66352 off 3649536 csum 628520344 private 2306345705 [16391.642713] btrfs csum failed ino 66352 off 4046848 csum 628520344 private 2306345705 [16391.754148] btrfs csum failed ino 66352 off 4571136 csum 628520344 private 2306345705 [16391.889749] btrfs csum failed ino 66352 off 4997120 csum 628520344 private 2306345705 [16391.932588] btrfs csum failed ino 66352 off 5062656 csum 628520344 private 2306345705 [16392.044805] btrfs csum failed ino 66352 off 5390336 csum 628520344 private 2306345705 [16398.159714] btrfs: disk space caching is enabled [16524.354181] ------------[ cut here ]------------ [16524.360805] WARNING: at fs/btrfs/delayed-inode.c:703 btrfs_delayed_update_inode+0x675/0x6b0 [btrfs]() [16524.372065] Hardware name: Santa Rosa platform [16524.372072] Modules linked in: dm_crypt loop btrfs [16524.372077] Pid: 27886, comm: btrfs-endio-wri Tainted: G D W 3.8.0-rc2-default+ #228 [16524.372078] Call Trace: [16524.372089] [<ffffffff8104c6bf>] warn_slowpath_common+0x7f/0xc0 [16524.372094] [<ffffffff8104c71a>] warn_slowpath_null+0x1a/0x20 [16524.372121] [<ffffffffa00847d5>] btrfs_delayed_update_inode+0x675/0x6b0 [btrfs] [16524.372148] [<ffffffffa0025d65>] ? 
btrfs_update_root_times+0x75/0x90 [btrfs] [16524.372173] [<ffffffffa003a85a>] btrfs_update_inode+0x6a/0x100 [btrfs] [16524.372198] [<ffffffffa003d637>] btrfs_update_inode_fallback+0x27/0x60 [btrfs] [16524.372223] [<ffffffffa003d920>] btrfs_finish_ordered_io+0x2b0/0x400 [btrfs] [16524.372231] [<ffffffff81957320>] ? _raw_spin_unlock_irq+0x30/0x50 [16524.372258] [<ffffffffa003da85>] finish_ordered_fn+0x15/0x20 [btrfs] [16524.372284] [<ffffffffa00603e4>] worker_loop+0xc4/0x5a0 [btrfs] [16524.372312] [<ffffffffa0060320>] ? btrfs_queue_worker+0x330/0x330 [btrfs] [16524.372321] [<ffffffff81073d4e>] kthread+0xde/0xf0 [16524.372330] [<ffffffff8107fbe6>] ? finish_task_switch+0x46/0xf0 [16524.372338] [<ffffffff81073c70>] ? flush_kthread_worker+0x1e0/0x1e0 [16524.372346] [<ffffffff819605ec>] ret_from_fork+0x7c/0xb0 [16524.372352] [<ffffffff81073c70>] ? flush_kthread_worker+0x1e0/0x1e0 [16524.372357] ---[ end trace e69c5ab147a5d84a ]--- [16524.695745] ------------[ cut here ]------------ [16524.702233] WARNING: at fs/btrfs/delayed-inode.c:703 btrfs_delayed_update_inode+0x675/0x6b0 [btrfs]() [16524.713324] Hardware name: Santa Rosa platform [16524.713331] Modules linked in: dm_crypt loop btrfs [16524.713336] Pid: 27886, comm: btrfs-endio-wri Tainted: G D W 3.8.0-rc2-default+ #228 [16524.713337] Call Trace: [16524.713347] [<ffffffff8104c6bf>] warn_slowpath_common+0x7f/0xc0 [16524.713352] [<ffffffff8104c71a>] warn_slowpath_null+0x1a/0x20 [16524.713377] [<ffffffffa00847d5>] btrfs_delayed_update_inode+0x675/0x6b0 [btrfs] [16524.713401] [<ffffffffa0025d65>] ? btrfs_update_root_times+0x75/0x90 [btrfs] [16524.713427] [<ffffffffa003a85a>] btrfs_update_inode+0x6a/0x100 [btrfs] [16524.713452] [<ffffffffa003d637>] btrfs_update_inode_fallback+0x27/0x60 [btrfs] [16524.713477] [<ffffffffa003d920>] btrfs_finish_ordered_io+0x2b0/0x400 [btrfs] [16524.713486] [<ffffffff81957320>] ? 
_raw_spin_unlock_irq+0x30/0x50 [16524.713511] [<ffffffffa003da85>] finish_ordered_fn+0x15/0x20 [btrfs] [16524.713539] [<ffffffffa00603e4>] worker_loop+0xc4/0x5a0 [btrfs] [16524.713566] [<ffffffffa0060320>] ? btrfs_queue_worker+0x330/0x330 [btrfs] [16524.713574] [<ffffffff81073d4e>] kthread+0xde/0xf0 [16524.713583] [<ffffffff8107fbe6>] ? finish_task_switch+0x46/0xf0 [16524.713590] [<ffffffff81073c70>] ? flush_kthread_worker+0x1e0/0x1e0 [16524.713596] [<ffffffff819605ec>] ret_from_fork+0x7c/0xb0 [16524.713602] [<ffffffff81073c70>] ? flush_kthread_worker+0x1e0/0x1e0 [16524.713607] ---[ end trace e69c5ab147a5d84b ]--- [16589.879707] btrfs: disk space caching is enabled [16642.502389] btrfs_readpage_end_io_hook: 54 callbacks suppressed [16642.509559] btrfs csum failed ino 66363 off 32768 csum 2486649235 private 3323905731 [16642.520865] btrfs csum failed ino 66363 off 32768 csum 2486649235 private 3323905731 [16644.572295] btrfs: disk space caching is enabled [16743.714911] btrfs csum failed ino 66372 off 113119232 csum 2279583572 private 2566472073 [16743.724717] btrfs csum failed ino 66372 off 113512448 csum 2279583572 private 2566472073 [16743.734544] btrfs csum failed ino 66372 off 113905664 csum 2279583572 private 2566472073 [16743.775876] btrfs csum failed ino 66372 off 113573888 csum 628520344 private 2566472073 [16750.940751] btrfs: disk space caching is enabled [16890.628694] btrfs: disk space caching is enabled [16892.647778] btrfs csum failed ino 66414 off 180224 csum 1270363740 private 2095194158 [17024.492935] btrfs: disk space caching is enabled [17024.584137] btrfs: free space inode generation (0) did not match free space cache generation (33) -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason
2013-Jan-08 01:17 UTC
Re: [bug] csum mismatches and failed xfstests with 3.8-rc1 and rc2
On Mon, Jan 07, 2013 at 10:06:40AM -0700, David Sterba wrote:
> On Mon, Jan 07, 2013 at 06:03:51PM +0100, David Sterba wrote:
> > with top commit 5f243b9b46a22e5790dbbc36f574c2417af49a41 (something post
> > -rc2) I see more checksum errors
> >
> > $ dmesg|grep csum|wc -l
> > 100
>
> more of dmesg:

I think Josef tracked this down to some enospc problems during truncate.
Looks like it isn't a regression, just harder to hit now.

We'll try and confirm tomorrow.

-chris
David Sterba
2013-Jan-22 14:26 UTC
Re: [bug] csum mismatches and failed xfstests with 3.8-rc1 -rc4
On Fri, Jan 04, 2013 at 01:50:59PM +0100, David Sterba wrote:
> I've noticed a few csum mismatch messages, and a few failed xfstests:

They're still there, we're on rc4, so I started looking for potential
patches to revert, but tonight the test reproduced csums with these
patches removed:

  Btrfs: do not call file_update_time in aio_write
  Btrfs: only unlock and relock if we have to
  Btrfs: use tokens where we can in the tree log
  Btrfs: only clear dirty on the buffer if it is marked as dirty
  Btrfs: log changed inodes based on the extent map tree
  Btrfs: do not mark ems as prealloc if we are writing to them
  Btrfs: keep track of the extents original block length
  Btrfs: inline csums if we're fsyncing
  Btrfs: don't bother copying if we're only logging the inode
  Btrfs: only log the inode item if we can get away with it

whole branch (test-next-csum in my git repo), was created by rebasing
btrfs-next/for-chris on top of linus/master commit
9a9284153d965a57edc7162a8e57c14c97f3a935. The patches were selected
semi-randomly and some of them are just dependencies that made merging
easier.

> 113:
> - it hung last evening and is still in that state, no disk or cpu activity,
>   there were only the tests running
> - no process is in D state, no btrfs kernel thread is active
> - the only interesting process is
>
> PID TTY STAT TIME COMMAND
> 15585 pts/0 Sl+ 0:01 /root/xfstests/ltp/aio-stress -t 20 -s 10 -O -S -I \
>   1000 /mnt/a1/aiostress.15188.4 /mnt/a1/aiostress.15188.4.20 \
>   /mnt/a1/aiostress.15188.4.19 /mnt/a1/aiostress.15188.4.18 \
> [<ffffffff810af447>] futex_wait_queue_me+0xc7/0x100
> [<ffffffff810affa1>] futex_wait+0x191/0x280
> [<ffffffff810b1cb6>] do_futex+0xd6/0xbd0
> [<ffffffff810b282b>] sys_futex+0x7b/0x180
> [<ffffffff8195fe99>] system_call_fastpath+0x16/0x1b

also reproduced, this happens like every 3rd run of the whole test.

david
Chris Mason
2013-Jan-22 14:39 UTC
Re: [bug] csum mismatches and failed xfstests with 3.8-rc1 -rc4
On Tue, Jan 22, 2013 at 07:26:15AM -0700, David Sterba wrote:
> On Fri, Jan 04, 2013 at 01:50:59PM +0100, David Sterba wrote:
> > I've noticed a few csum mismatch messages, and a few failed xfstests:
>
> They're still there, we're on rc4, so I started looking for potential
> patches to revert, but tonight the test reproduced csums with these
> patches removed:

I'm able to trigger crc errors with just 50 parallel fsx O_DIRECT procs
hammering in parallel. Trying to nail down the test case a little
better.

#!/bin/bash

num=$1

if [ "x$num" == "x" ]; then
	num=50
fi

echo "using $num procs"

for x in `seq 1 $num` ; do
	echo -n "$x "
	fsx -q xxxf$x -Z -R -W -r 4096 -w 4096 &
done
echo "waiting"
wait
Liu Bo
2013-Jan-23 06:16 UTC
Re: [bug] csum mismatches and failed xfstests with 3.8-rc1 -rc4
On Tue, Jan 22, 2013 at 09:39:02AM -0500, Chris Mason wrote:
> On Tue, Jan 22, 2013 at 07:26:15AM -0700, David Sterba wrote:
> > On Fri, Jan 04, 2013 at 01:50:59PM +0100, David Sterba wrote:
> > > I've noticed a few csum mismatch messages, and a few failed xfstests:
> >
> > They're still there, we're on rc4, so I started looking for potential
> > patches to revert, but tonight the test reproduced csums with these
> > patches removed:
>
> I'm able to trigger crc errors with just 50 parallel fsx O_DIRECT procs
> hammering in parallel. Trying to nail down the test case a little
> better.

After this patch (Btrfs: put csums on the right ordered extent), I'm
unable to get crc errors by running this script.

thanks,
liubo

>
> #!/bin/bash
>
> num=$1
>
> if [ "x$num" == "x" ]; then
> 	num=50
> fi
>
> echo "using $num procs"
>
> for x in `seq 1 $num` ; do
> 	echo -n "$x "
> 	fsx -q xxxf$x -Z -R -W -r 4096 -w 4096 &
> done
> echo "waiting"
> wait
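When checking a run of the script above for crc errors, it can help to tally the csum failures in the kernel log per inode rather than only counting lines. The following is a minimal sketch and was not posted in the thread: it assumes the message format seen in the dmesg output here ("btrfs csum failed ino N off ..."), and the function name csum_failures_by_inode is made up for illustration.

```shell
#!/bin/bash
# Tally "btrfs csum failed" messages per inode from kernel log text on stdin.
# Assumed message format (as seen in this thread):
#   [timestamp] btrfs csum failed ino <N> off <M> csum <X> private <Y>
# Output: one "<inode> <count>" pair per line, sorted by inode number.
csum_failures_by_inode() {
	awk '/btrfs csum failed ino/ {
		# Find the "ino" token and count the field that follows it.
		for (i = 1; i < NF; i++)
			if ($i == "ino") { count[$(i + 1)]++; break }
	}
	END { for (ino in count) print ino, count[ino] }' | sort -n
}

# Usage (hypothetical): dmesg | csum_failures_by_inode
```

An empty result after a run would suggest that no csum failures were logged, which is the outcome reported above with the patch applied.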