thr3ads.net - Btrfs devel - Warning: bad fsid on block 20971520 [Jan 2012]

David Sterba

2012-Jan-11 15:46 UTC

Warning: bad fsid on block 20971520

Hi,

the $subj warning appears sometimes in syslog, in my case when
xfstests/209 runs looped. The minimal reproducer is looped mkfs+mount.
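
A minimal sketch of such a mkfs+mount loop, in Python, assuming a scratch partition whose contents may be destroyed. The `cycle_commands` and `run_loop` helpers, the device path and the mount point are hypothetical; only `mkfs.btrfs`, `mount` and `umount` are real commands, and the mount options used in the original test are not reproduced here.

```python
import subprocess

def cycle_commands(dev, mnt, mkfs_args=()):
    """Return the argv lists for one mkfs + mount + umount cycle."""
    return [
        # Newer btrfs-progs may need "-f" in mkfs_args to overwrite an existing fs.
        ["mkfs.btrfs", *mkfs_args, dev],
        ["mount", dev, mnt],
        ["umount", mnt],
    ]

def run_loop(dev, mnt, iterations, run=subprocess.run, mkfs_args=()):
    """Repeat the cycle.

    Returns the 0-based index of the first iteration whose mount failed,
    or None if every mount succeeded.
    """
    for i in range(iterations):
        mkfs, mount, umount = cycle_commands(dev, mnt, mkfs_args)
        run(mkfs, check=True, capture_output=True)
        if run(mount).returncode != 0:
            return i  # stop here so the kernel log can be inspected
        run(umount, check=True)
    return None
```

Run as root, a call such as `run_loop("/dev/sdX9", "/mnt/scratch", 1000)` (placeholder paths) would stop at the first failed mount, which is the point to look at the kernel log. The `run` parameter exists only so the loop can be exercised without touching a real device.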

The message comes from disk-io.c btree_readpage_end_io_hook():

        if (check_tree_block_fsid(root, eb)) {
                printk_ratelimited(KERN_INFO "btrfs bad fsid on block %llu\n",
                               (unsigned long long)eb->start);
                ret = -EIO;
                goto err;
        }
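
As a rough userspace illustration of what this check amounts to: each btrfs tree block begins with a header that records the filesystem UUID and the block's own logical address, and the read hook compares these against what it expects. The sketch below assumes the start of the on-disk header is a 32-byte checksum, a 16-byte fsid and a little-endian 64-bit bytenr. `parse_header` and `check_block` are hypothetical helpers, not kernel functions, and the order of the two checks is illustrative.

```python
import struct
import uuid

# Assumed start of the btrfs tree block header:
# csum[32], fsid[16], bytenr (little-endian u64)
HEADER_FMT = "<32s16sQ"
HEADER_LEN = struct.calcsize(HEADER_FMT)  # 56 bytes with no padding

def parse_header(block):
    """Return (fsid, bytenr) read from the first bytes of a tree block."""
    _csum, fsid, bytenr = struct.unpack_from(HEADER_FMT, block)
    return uuid.UUID(bytes=fsid), bytenr

def check_block(block, expected_fsid, expected_start):
    """Mimic the two sanity checks; return an error string or None."""
    fsid, bytenr = parse_header(block)
    if bytenr != expected_start:
        return f"bad tree block start {bytenr} {expected_start}"
    if fsid != expected_fsid:
        return f"bad fsid on block {expected_start}"
    return None

if __name__ == "__main__":
    # Sample data only: the fsid from the mount log, and a different fsid
    # (the one that appears in the btrfs-debug-tree dump later in this message).
    mounted = uuid.UUID("dda1a3db-3106-4bb9-8ecf-2e823938d538")
    other = uuid.UUID("024cd2e6-d584-493c-af81-fa3e2f548abb")
    block = struct.pack(HEADER_FMT, bytes(32), other.bytes, 20971520)
    print(check_block(block, mounted, 20971520))
```

A block whose header carries the right address but a different fsid is reported as a bad fsid, which is consistent with reading a block left over from an earlier mkfs run.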

relevant syslog messages:

[420367.199710] device fsid dda1a3db-3106-4bb9-8ecf-2e823938d538 devid 1 transid 4 /dev/sda9
[420367.209438] btrfs: force lzo compression
[420367.214695] btrfs: enabling inode map caching
[420367.220404] btrfs: enabling auto defrag
[420367.224193] btrfs: disk space caching is enabled
[420367.323356] btrfs bad fsid on block 20971520
[420367.358349] btrfs bad fsid on block 20971520
[420367.368272] btrfs bad fsid on block 20971520
[420367.376239] btrfs bad fsid on block 20971520
[420367.381836] btrfs bad fsid on block 20971520
[420367.467332] btrfs bad fsid on block 20971520
[420367.473249] btrfs bad fsid on block 20971520
[420367.478649] btrfs: failed to read chunk root on sda9
[420367.487810] btrfs: open_ctree failed

and mount fails.

/proc/partitions:
   8        9   10485760 sda9

That is 10485760 * 1024 bytes, i.e. 2621440 4k blocks.

The number 20971520 is not a block number but a byte offset, so the
message might be confusing at first. In the 1024-byte blocks used by the
extent listing below, the real block number is
20971520 / 1024 = 20480 (or 5120 in 4k blocks).
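
The unit conversion can be checked with a few lines of Python. This is only an arithmetic sketch; `byte_offset_to_block` is a hypothetical helper, and the 1024-byte unit is taken from the `blocksize 1024` shown in the extent listing. Note that dividing by 4096 yields 5120, while 20480 is the count of 1024-byte blocks.

```python
def byte_offset_to_block(offset, block_size):
    """Convert a byte offset to a block number; the offset must be block-aligned."""
    block, remainder = divmod(offset, block_size)
    if remainder:
        raise ValueError(f"offset {offset} is not a multiple of {block_size}")
    return block

if __name__ == "__main__":
    offset = 20971520                   # value printed in the kernel message
    partition_bytes = 10485760 * 1024   # /proc/partitions counts 1 KiB units
    print(byte_offset_to_block(offset, 1024))            # 1 KiB blocks
    print(byte_offset_to_block(offset, 4096))            # 4 KiB blocks
    print(byte_offset_to_block(partition_bytes, 4096))   # partition size in 4 KiB blocks
```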

used blocks on a freshly created device:

File size of test-10g is 10737418240 (10485760 blocks, blocksize 1024)
 ext logical physical expected length flags
   0       0 10248192            2048
   1    4096 10252288 10250239     20
   2   20480 10293248 10252307      4
   3   28672 10301440 10293251      4
   4   36864 10260480 10301443     24
   5   65536 10264576 10260503      4
   6 1085440 10276864 10264579     24
   7 10483712   294912 10276887   2048 eof

It's "extent" nr. 2, not a superblock. The block obviously does not
contain the expected data, though it was submitted and supposed to
be written by the mkfs step. The question is where the update gets lost:
in the block layer, or in the write caches of the disk.

btrfs-debug-tree says it's:

chunk tree
leaf 20971520 items 6 free space 3283 generation 4 owner 3
fs uuid 024cd2e6-d584-493c-af81-fa3e2f548abb
chunk uuid 3f52ec70-89a9-4cd5-b5f0-177d7ae63de3
        item 0 key (DEV_ITEMS DEV_ITEM 1) itemoff 3897 itemsize 98
                dev item devid 1 total_bytes 10737418240 bytes used 2185232384
        item 1 key (FIRST_CHUNK_TREE CHUNK_ITEM 0) itemoff 3817 itemsize 80
                chunk length 4194304 owner 2 type 2 num_stripes 1
                        stripe 0 devid 1 offset 0
        item 2 key (FIRST_CHUNK_TREE CHUNK_ITEM 4194304) itemoff 3737 itemsize 80
                chunk length 8388608 owner 2 type 4 num_stripes 1
                        stripe 0 devid 1 offset 4194304
        item 3 key (FIRST_CHUNK_TREE CHUNK_ITEM 12582912) itemoff 3657 itemsize 80
                chunk length 8388608 owner 2 type 1 num_stripes 1
                        stripe 0 devid 1 offset 12582912
        item 4 key (FIRST_CHUNK_TREE CHUNK_ITEM 20971520) itemoff 3545 itemsize 112
                chunk length 8388608 owner 2 type 34 num_stripes 2
                        stripe 0 devid 1 offset 20971520
                        stripe 1 devid 1 offset 29360128
        item 5 key (FIRST_CHUNK_TREE CHUNK_ITEM 29360128) itemoff 3433 itemsize 112
                chunk length 1073741824 owner 2 type 36 num_stripes 2
                        stripe 0 devid 1 offset 37748736
                        stripe 1 devid 1 offset 1111490560
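
To make the dump easier to read, the sketch below decodes the chunk `type` values and maps a logical address to its on-disk copies. It assumes the usual btrfs block group flag bits (DATA=1, SYSTEM=2, METADATA=4, DUP=32) and that, for single and DUP chunks, each stripe stores the data at the same offset within the chunk. `CHUNKS`, `decode_type` and `physical_copies` are hypothetical names; the values are copied from the dump above, where every stripe is on devid 1.

```python
# Block group flag bits (assumed from the btrfs on-disk format).
FLAGS = {1: "DATA", 2: "SYSTEM", 4: "METADATA", 8: "RAID0",
         16: "RAID1", 32: "DUP", 64: "RAID10"}

# (logical start, length, type, [stripe offsets on devid 1]) from the dump.
CHUNKS = [
    (0,        4194304,    2,  [0]),
    (4194304,  8388608,    4,  [4194304]),
    (12582912, 8388608,    1,  [12582912]),
    (20971520, 8388608,    34, [20971520, 29360128]),
    (29360128, 1073741824, 36, [37748736, 1111490560]),
]

def decode_type(value):
    """Turn a numeric chunk type into a readable flag string."""
    return "|".join(name for bit, name in sorted(FLAGS.items()) if value & bit)

def physical_copies(logical):
    """Return (profile, [device byte offsets]) for a logical address."""
    for start, length, ctype, stripes in CHUNKS:
        if start <= logical < start + length:
            if ctype & (8 | 64):
                raise NotImplementedError("striped profiles are not handled here")
            return decode_type(ctype), [s + (logical - start) for s in stripes]
    raise LookupError(f"no chunk covers logical address {logical}")

if __name__ == "__main__":
    print(physical_copies(20971520))
```

Under these assumptions the leaf at 20971520 lives in a SYSTEM|DUP chunk with copies at device offsets 20971520 and 29360128. The second offset is 28672 in 1024-byte blocks, which lines up with extent 3 of the listing above.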


david
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Chris Mason

2012-Jan-11 16:37 UTC

Re: Warning: bad fsid on block 20971520

On Wed, Jan 11, 2012 at 04:46:21PM +0100, David Sterba wrote:
> Hi,
> 
> the $subj warning appears sometimes in syslog, in my case when
> xfstests/209 runs looped. The minimal reproducer is looped mkfs+mount.

I've been seeing this as well.  It's new with 3.2, and I haven't yet
been able to track it down.

The first thing that happens when we mount the FS is a block layer
invalidate, and that must be dropping the write.

It's also possible (but very unlikely) that mkfs.btrfs is neglecting to
write that block.

Do you have a reliable way to reproduce?  If so, can you try with a much
older mkfs.btrfs?

-chris

David Sterba

2012-Jan-11 22:34 UTC

Re: Warning: bad fsid on block 20971520

On Wed, Jan 11, 2012 at 11:37:14AM -0500, Chris Mason wrote:
> I've been seeing this as well.  It's new with 3.2, and I haven't yet
> been able to track it down.
> 
> The first thing that happens when we mount the FS is a block layer
> invalidate, and that must be dropping the write.
> 
> It's also possible (but very unlikely) that mkfs.btrfs is neglecting to
> write that block.
> 
> Do you have a reliable way to reproduce?  If so, can you try with a much
> older mkfs.btrfs?

I built mkfs from v0.19 and let 209 loop again, the error appeared 6x
during 3 hours, the last 5 occurrences within 10 minutes.  Should I try
even older mkfs?

I will try to catch it with blktrace running.

thanks,
david

Chris Mason

2012-Jan-11 22:54 UTC

Re: Warning: bad fsid on block 20971520

On Wed, Jan 11, 2012 at 11:34:34PM +0100, David Sterba wrote:
> On Wed, Jan 11, 2012 at 11:37:14AM -0500, Chris Mason wrote:
> > I've been seeing this as well.  It's new with 3.2, and I haven't yet
> > been able to track it down.
> > 
> > The first thing that happens when we mount the FS is a block layer
> > invalidate, and that must be dropping the write.
> > 
> > It's also possible (but very unlikely) that mkfs.btrfs is neglecting to
> > write that block.
> > 
> > Do you have a reliable way to reproduce?  If so, can you try with a much
> > older mkfs.btrfs?
> 
> I built mkfs from v0.19 and let 209 loop again, the error appeared 6x
> during 3 hours, the last 5 occurrences within 10 minutes.  Should I try
> even older mkfs?

Nah, I'd try the latest mkfs on a 3.0 kernel.  I think it's a change to
the block device invalidate code that happens on mount.

-chris

David Sterba

2012-Jan-16 14:34 UTC

Re: Warning: bad fsid on block 20971520

On Wed, Jan 11, 2012 at 11:34:34PM +0100, David Sterba wrote:
> I will try to catch it with blktrace running.

Measurement disrupted the experiment. The second I start blktrace, these
messages

[450482.299863] device fsid 7f7bfb60-b8f3-457e-857a-9a1a187f750f devid 1 transid 7 /dev/sda9
[450482.309642] btrfs: force lzo compression
[450482.314802] btrfs: enabling inode map caching
[450482.320363] btrfs: enabling auto defrag
[450482.324138] btrfs: disk space caching is enabled
[450482.378652] btrfs: failed to read chunk root on sda9
[450482.385397] btrfs: open_ctree failed 

appear in the log, mount fails and the test is not performed. (And they
immediately stop when blktrace stops.)

There are a few occurrences of the

[450491.373282] btrfs bad fsid on block 20971520

message. The blktrace log does not contain any record of 'mkfs' activity;
the other involved processes are there (mount, aio-dio, kernel threads).

The other day I saw

[ 8334.490486] device fsid 830e57b6-b9c3-471c-b4dc-4a8c2c56fb35 devid 1 transid 4 /dev/sda9
[ 8334.500482] btrfs: force lzo compression
[ 8334.505853] btrfs: enabling inode map caching
[ 8334.511539] btrfs: enabling auto defrag
[ 8334.515234] btrfs: disk space caching is enabled
[ 8334.532517] btrfs  0 12582912
[ 8334.551594] btrfs bad tree block start 20971520 12582912
[ 8334.560353] btrfs bad tree block start 0 12582912
[ 8334.568263] btrfs bad tree block start 20971520 12582912
[ 8334.575188] btrfs bad tree block start 20971520 12582912
[ 8334.581946] btrfs bad tree block start 0 12582912
[ 8334.588044] btrfs bad tree block start 20971520 12582912
[ 8334.594543] btrfs: failed to read chunk root on sda9
[ 8334.601040] btrfs: open_ctree failed

a different instance of the same.  So this looks rather serious.
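
For long looped runs it can help to tally these messages instead of reading them by eye. The sketch below is one way to do that, assuming the "bad tree block start" message prints the start found in the block header first and the expected start second. `summarize` and the two patterns are hypothetical helpers written for the log lines quoted in this thread.

```python
import re
from collections import Counter

BAD_FSID = re.compile(r"btrfs bad fsid on block (\d+)")
BAD_START = re.compile(r"btrfs bad tree block start (\d+) (\d+)")

def summarize(lines):
    """Count 'bad fsid' offsets and (found, expected) 'bad tree block start' pairs."""
    fsid, start = Counter(), Counter()
    for line in lines:
        m = BAD_FSID.search(line)
        if m:
            fsid[int(m.group(1))] += 1
            continue
        m = BAD_START.search(line)
        if m:
            start[(int(m.group(1)), int(m.group(2)))] += 1
    return fsid, start

if __name__ == "__main__":
    sample = [
        "[ 8334.551594] btrfs bad tree block start 20971520 12582912",
        "[ 8334.560353] btrfs bad tree block start 0 12582912",
        "[420367.323356] btrfs bad fsid on block 20971520",
    ]
    fsid, start = summarize(sample)
    print(dict(fsid))
    print(dict(start))
```

Read this way, the second log shows the block at 12582912 returning a header that claims to be block 20971520, or an all-zero start, rather than a wrong fsid.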

Per your advice, I'll try to test with other filesystems, with older
kernels, and in the btrfs case add fsync into mkfs.


david

David Sterba

2012-Jan-16 18:36 UTC

Re: Warning: bad fsid on block 20971520

On Mon, Jan 16, 2012 at 03:34:28PM +0100, David Sterba wrote:
> Per your advice, I'll try to test with other filesystems, with older
> kernels, and in btrfs case add fsync into mkfs.

I left the 3.0.13-based SLES kernel looping and did not trigger the
warning for several hours.

In the meantime I grepped through my serial console logs and found that
the first 'bad fsid' message appeared in 3.0.0-rc5+, dated 2011-06-01:

[73673.623530] device fsid 5f1b5c0e-21ff-4896-bf48-8d64558dd205 devid 1 transid 7 /dev/sdb10
[73673.633194] btrfs: enabling auto defrag
[73673.636915] btrfs: enabling disk space caching
[73673.644124] btrfs: enabling inode map caching
[73673.649733] btrfs: force lzo compression
[73673.740630] btrfs bad fsid on block 20971520
[73673.746400] btrfs bad fsid on block 20971520
[73673.760785] btrfs bad fsid on block 20971520
[73673.766284] btrfs: failed to read chunk root on sdb10
[73673.772969] btrfs warning page private not zero on page 20971520
[73673.792224] btrfs: open_ctree failed

and there are several messages from the 3.1.0-rc4 kernel, with no more
occurrences of the "page private not zero" message.


david
