I''m running openSUSE 12.3 x86_64 which has an unknown git version, but reports v0.19. I''m also supplying the output from git which reports itself as: v0.20-rc1-253-g7854c8b The problem is that btrfs-show (git) and btrfs fi show (git) give /different/ output from each other which is also different from the (older) btrfs. First btrfs-show (git): ** ** WARNING: this program is considered deprecated ** Please consider to switch to the btrfs utility ** Label: none uuid: b5dc52bd-21bf-4173-8049-d54d88c82240 Total devices 2 FS bytes used 230.34GB devid 1 size 298.09GB used 298.09GB path /dev/sda *** Some devices missing Label: none uuid: 7feedf1e-9711-4900-af9c-92738ea8aace Total devices 4 FS bytes used 348.21GB devid 4 size 460.76GB used 444.23GB path /dev/sdb3 devid 3 size 460.76GB used 444.23GB path /dev/sda3 devid 6 size 298.09GB used 282.53GB path /dev/sdc devid 5 size 298.09GB used 282.53GB path /dev/sdd Btrfs v0.20-rc1-253-g7854c8b Now btrfs fi show (git): failed to open /dev/sr0: No medium found Label: none uuid: 7feedf1e-9711-4900-af9c-92738ea8aace Total devices 4 FS bytes used 348.21GB devid 5 size 298.09GB used 282.53GB path /dev/sdd devid 6 size 298.09GB used 282.53GB path /dev/sdc devid 4 size 460.76GB used 444.23GB path /dev/sdb3 devid 3 size 460.76GB used 444.23GB path /dev/sda3 Label: none uuid: b5dc52bd-21bf-4173-8049-d54d88c82240 Total devices 2 FS bytes used 230.34GB devid 1 size 298.09GB used 298.09GB path /dev/sdb *** Some devices missing Btrfs v0.20-rc1-253-g7854c8b And now the (older) btrfs fi show: failed to read /dev/sr0 Label: none uuid: 7feedf1e-9711-4900-af9c-92738ea8aace Total devices 4 FS bytes used 348.21GB devid 5 size 298.09GB used 282.53GB path /dev/sdd devid 6 size 298.09GB used 282.53GB path /dev/sdc devid 4 size 460.76GB used 444.23GB path /dev/sdb3 devid 3 size 460.76GB used 444.23GB path /dev/sda3 Btrfs v0.19+ which has similar output to the (older) btrfs-show. The differences are: 1. the order of devices varies (not a big deal) 2. the current git btrfs-show and btrfs fi show both output *different* devices for device with UUID b5dc52bd-21bf-4173-8049-d54d88c82240, and they''re both wrong. Somewhat confusingly, /dev/disk/by-uuid/ only shows three devices: bd40cb4e-93bb-4600-8455-ca1185aa8abe -> ../../md3 7feedf1e-9711-4900-af9c-92738ea8aace -> ../../sda3 4f0b27a5-5f5b-413c-a71b-b5a3bec5482c -> ../../md2 What''s going on with the varied output from (git) btrfs-show vs. ''btrfs fi show'' (vs. the older, as-shipped btrfs)? -- Jon Software Blacksmith -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/21/13 10:04 AM, Jon Nelson wrote:> I''m running openSUSE 12.3 x86_64 which has an unknown git version, but > reports v0.19. > I''m also supplying the output from git which reports itself as: > v0.20-rc1-253-g7854c8b > > The problem is that btrfs-show (git) and btrfs fi show (git) give > /different/ output from each other which is also different from the > (older) btrfs. > > First btrfs-show (git): > > ** > ** WARNING: this program is considered deprecated > ** Please consider to switch to the btrfs utility > ** > Label: none uuid: b5dc52bd-21bf-4173-8049-d54d88c82240 > Total devices 2 FS bytes used 230.34GB > devid 1 size 298.09GB used 298.09GB path /dev/sda > *** Some devices missing > > Label: none uuid: 7feedf1e-9711-4900-af9c-92738ea8aace > Total devices 4 FS bytes used 348.21GB > devid 4 size 460.76GB used 444.23GB path /dev/sdb3 > devid 3 size 460.76GB used 444.23GB path /dev/sda3 > devid 6 size 298.09GB used 282.53GB path /dev/sdc > devid 5 size 298.09GB used 282.53GB path /dev/sdd > > Btrfs v0.20-rc1-253-g7854c8b > > > Now btrfs fi show (git): > > failed to open /dev/sr0: No medium found > Label: none uuid: 7feedf1e-9711-4900-af9c-92738ea8aace > Total devices 4 FS bytes used 348.21GB > devid 5 size 298.09GB used 282.53GB path /dev/sdd > devid 6 size 298.09GB used 282.53GB path /dev/sdc > devid 4 size 460.76GB used 444.23GB path /dev/sdb3 > devid 3 size 460.76GB used 444.23GB path /dev/sda3 > > Label: none uuid: b5dc52bd-21bf-4173-8049-d54d88c82240 > Total devices 2 FS bytes used 230.34GB > devid 1 size 298.09GB used 298.09GB path /dev/sdb > *** Some devices missing > > Btrfs v0.20-rc1-253-g7854c8b > > > And now the (older) btrfs fi show: > > failed to read /dev/sr0 > Label: none uuid: 7feedf1e-9711-4900-af9c-92738ea8aace > Total devices 4 FS bytes used 348.21GB > devid 5 size 298.09GB used 282.53GB path /dev/sdd > devid 6 size 298.09GB used 282.53GB path /dev/sdc > devid 4 size 460.76GB used 444.23GB path /dev/sdb3 > devid 3 size 460.76GB used 444.23GB path /dev/sda3 > > Btrfs v0.19+ > > which has similar output to the (older) btrfs-show. > > > The differences are: > > 1. the order of devices varies (not a big deal)Heh, that is annoying though.> 2. the current git btrfs-show and btrfs fi show both output > *different* devices for device with UUID > b5dc52bd-21bf-4173-8049-d54d88c82240, and they''re both wrong.does blkid output find that uuid anywhere? Since you''re working in git, can you maybe do a little bisecting to find out when it changed? Should be a fairly quick test? -Eric> Somewhat confusingly, /dev/disk/by-uuid/ only shows three devices: > > bd40cb4e-93bb-4600-8455-ca1185aa8abe -> ../../md3 > 7feedf1e-9711-4900-af9c-92738ea8aace -> ../../sda3 > 4f0b27a5-5f5b-413c-a71b-b5a3bec5482c -> ../../md2 > > What''s going on with the varied output from (git) btrfs-show vs. > ''btrfs fi show'' (vs. the older, as-shipped btrfs)? > >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Mar 21, 2013 at 10:11 AM, Eric Sandeen <sandeen@redhat.com> wrote:> On 3/21/13 10:04 AM, Jon Nelson wrote:...>> 2. the current git btrfs-show and btrfs fi show both output >> *different* devices for device with UUID >> b5dc52bd-21bf-4173-8049-d54d88c82240, and they''re both wrong. > > does blkid output find that uuid anywhere? > > Since you''re working in git, can you maybe do a little bisecting > to find out when it changed? Should be a fairly quick test?blkid does /not/ report that uuid anywhere. git bisect, if I did it correctly, says: 6eba9002956ac40db87d42fb653a0524dc568810 is the first bad commit commit 6eba9002956ac40db87d42fb653a0524dc568810 Author: Goffredo Baroncelli <kreijack@inwind.it> Date: Tue Sep 4 19:59:26 2012 +0200 Correct un-initialized fsid variable :100644 100644 b21a87f827a6250da45f2fb6a1c3a6b651062243 03952051b5e25e0b67f0f910c84d93eb90de8480 M disk-io.c -- Jon Software Blacksmith -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hallo, Jon, Du meintest am 21.03.13:>>> 2. the current git btrfs-show and btrfs fi show both output >>> *different* devices for device with UUID >>> b5dc52bd-21bf-4173-8049-d54d88c82240, and they''re both wrong.>> does blkid output find that uuid anywhere?>> Since you''re working in git, can you maybe do a little bisecting >> to find out when it changed? Should be a fairly quick test?> blkid does /not/ report that uuid anywhere."blkid" seems to be a bit clueless. Try "file -s" instead. Viele Gruesse! Helmut -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hallo, Jon, Du meintest am 21.03.13:> First btrfs-show (git):> ** > ** WARNING: this program is considered deprecated > ** Please consider to switch to the btrfs utility"btrfs-show" makes nasty errors, especially together with "blkid". Delete "btrfs-show". That''s the safe way. Viele Gruesse! Helmut -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/21/13 10:29 AM, Jon Nelson wrote:> On Thu, Mar 21, 2013 at 10:11 AM, Eric Sandeen <sandeen@redhat.com> wrote: >> On 3/21/13 10:04 AM, Jon Nelson wrote: > ... >>> 2. the current git btrfs-show and btrfs fi show both output >>> *different* devices for device with UUID >>> b5dc52bd-21bf-4173-8049-d54d88c82240, and they''re both wrong. >> >> does blkid output find that uuid anywhere? >> >> Since you''re working in git, can you maybe do a little bisecting >> to find out when it changed? Should be a fairly quick test? > > blkid does /not/ report that uuid anywhere. > > git bisect, if I did it correctly, says: > > > 6eba9002956ac40db87d42fb653a0524dc568810 is the first bad commit > commit 6eba9002956ac40db87d42fb653a0524dc568810 > Author: Goffredo Baroncelli <kreijack@inwind.it> > Date: Tue Sep 4 19:59:26 2012 +0200 > > Correct un-initialized fsid variable > > :100644 100644 b21a87f827a6250da45f2fb6a1c3a6b651062243 > 03952051b5e25e0b67f0f910c84d93eb90de8480 M disk-io.cOk, I think this is another case of greedily scanning stale backup superblocks (did you ever have btrfs on the whole sda or sdb?) btrfs_read_dev_super() currently tries to scan all 3 superblocks (primary & 2 backups). I''m guessing that you have some stale backup superblocks on sda and/or sdb. Before the above commit, if the first sb didn''t look valid, it''d skip to the 2nd. If the 2nd (stale) one looked OK, it''d compare its fsid to an uniniitialized variable (fsid) which would fail (since the "fsid" contents were random.) Same for the 3rd backup if found, and eventually it''d return -1 as failure and not report the device. After the commit, it''d skip the first invalid sb as well. But this time, it takes the fsid from the 2nd superblock as "good" and makes it through the loop thinking that it''s found something valid. Hence the report of a device which you didn''t expect even though the first superblock is indeed wiped out. There are some patches floating around to stop this backup superblock scanning altogether. This might fix it for you; it basically returns failure if any superblock on the device is found to be bad. What we really need is the right bits in the right places to let the administrator know if a device looks like it might be corrupt & in need of fixing, vs. ignoring it altogether. Not sure if this is something we want upstream but you could test if if you like. -Eric [PATCH] Fail btrfs_read_dev_super if any super looks invalid. Signed-off-by: Eric Sandeen <sandeen@redhat.com> --- diff --git a/disk-io.c b/disk-io.c index a9fd374..3aca16f 100644 --- a/disk-io.c +++ b/disk-io.c @@ -1104,7 +1104,6 @@ struct btrfs_root *open_ctree_fd(int fp, const char *path, u64 sb_bytenr, int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 sb_bytenr) { u8 fsid[BTRFS_FSID_SIZE]; - int fsid_is_initialized = 0; struct btrfs_super_block buf; int i; int ret; @@ -1130,24 +1129,22 @@ int btrfs_read_dev_super(int fd, struct btrfs_super_block *sb, u64 sb_bytenr) if (ret < sizeof(buf)) break; - if (btrfs_super_bytenr(&buf) != bytenr ) - continue; - /* if magic is NULL, the device was removed */ - if (buf.magic == 0 && i == 0) + /* Bail out on any invalid superblock */ + if (btrfs_super_bytenr(&buf) != bytenr || + buf.magic != cpu_to_le64(BTRFS_MAGIC)) return -1; - if (buf.magic != cpu_to_le64(BTRFS_MAGIC)) - continue; - if (!fsid_is_initialized) { + /* if sb 0 looks valid use its fsid as trusted */ + if (i == 0) memcpy(fsid, buf.fsid, sizeof(fsid)); - fsid_is_initialized = 1; - } else if (memcmp(fsid, buf.fsid, sizeof(fsid))) { + else if (memcmp(fsid, buf.fsid, sizeof(fsid))) { /* * the superblocks (the original one and * its backups) contain data of different - * filesystems -> the super cannot be trusted + * filesystems -> the super cannot be trusted, + * hence the fs cannot be trusted. */ - continue; + return -1; } if (btrfs_super_generation(&buf) > transid) { -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> What we really need is the right bits in the right places > to let the administrator know if a device looks like it might > be corrupt & in need of fixing, vs. ignoring it altogether.-- > 2. the current git btrfs-show and btrfs fi show both output > *different* devices for device with UUID > b5dc52bd-21bf-4173-8049-d54d88c82240, and they''re both wrong. -- here per Jon experience showing stale btrfs on /dev/sdb (note that good btrfs is on /dev/sdb3) is wrong. More over when we show something like /dev/sdb and /dev/sdb3 both have two different btrfs, its a bit scary. I would vote for ignoring it altogether since we have btrfs-find-root and btrfs-select-super to help find the backup SB for recovery if needed. Thanks. -Anand -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Mar 21, 2013 at 11:25 AM, Eric Sandeen <sandeen@redhat.com> wrote:> On 3/21/13 10:29 AM, Jon Nelson wrote: >> On Thu, Mar 21, 2013 at 10:11 AM, Eric Sandeen <sandeen@redhat.com> wrote: >>> On 3/21/13 10:04 AM, Jon Nelson wrote: >> ... >>>> 2. the current git btrfs-show and btrfs fi show both output >>>> *different* devices for device with UUID >>>> b5dc52bd-21bf-4173-8049-d54d88c82240, and they''re both wrong. >>> >>> does blkid output find that uuid anywhere? >>> >>> Since you''re working in git, can you maybe do a little bisecting >>> to find out when it changed? Should be a fairly quick test? >> >> blkid does /not/ report that uuid anywhere. >> >> git bisect, if I did it correctly, says: >> >> >> 6eba9002956ac40db87d42fb653a0524dc568810 is the first bad commit >> commit 6eba9002956ac40db87d42fb653a0524dc568810 >> Author: Goffredo Baroncelli <kreijack@inwind.it> >> Date: Tue Sep 4 19:59:26 2012 +0200 >> >> Correct un-initialized fsid variable >> >> :100644 100644 b21a87f827a6250da45f2fb6a1c3a6b651062243 >> 03952051b5e25e0b67f0f910c84d93eb90de8480 M disk-io.c > > Ok, I think this is another case of greedily scanning stale > backup superblocks (did you ever have btrfs on the whole sda > or sdb?) > > btrfs_read_dev_super() currently tries to scan all 3 superblocks > (primary & 2 backups). I''m guessing that you have some stale > backup superblocks on sda and/or sdb. > > Before the above commit, if the first sb didn''t look valid, > it''d skip to the 2nd. If the 2nd (stale) one looked OK, > it''d compare its fsid to an uniniitialized variable (fsid) > which would fail (since the "fsid" contents were random.) > Same for the 3rd backup if found, and eventually it''d return > -1 as failure and not report the device. > > After the commit, it''d skip the first invalid sb as well. > But this time, it takes the fsid from the 2nd superblock as > "good" and makes it through the loop thinking that it''s found > something valid. Hence the report of a device which you didn''t > expect even though the first superblock is indeed wiped out. > > There are some patches floating around to stop this > backup superblock scanning altogether. > > This might fix it for you; it basically returns failure > if any superblock on the device is found to be bad. > > What we really need is the right bits in the right places > to let the administrator know if a device looks like it might > be corrupt & in need of fixing, vs. ignoring it altogether. > > Not sure if this is something we want upstream but you could > test if if you like.I did test and it appears to resolve the issue for me. Thank you! -- Jon Software Blacksmith -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3/22/13 8:59 AM, Jon Nelson wrote:> On Thu, Mar 21, 2013 at 11:25 AM, Eric Sandeen <sandeen@redhat.com> wrote: >> On 3/21/13 10:29 AM, Jon Nelson wrote: >>> On Thu, Mar 21, 2013 at 10:11 AM, Eric Sandeen <sandeen@redhat.com> wrote: >>>> On 3/21/13 10:04 AM, Jon Nelson wrote: >>> ... >>>>> 2. the current git btrfs-show and btrfs fi show both output >>>>> *different* devices for device with UUID >>>>> b5dc52bd-21bf-4173-8049-d54d88c82240, and they''re both wrong. >>>> >>>> does blkid output find that uuid anywhere? >>>> >>>> Since you''re working in git, can you maybe do a little bisecting >>>> to find out when it changed? Should be a fairly quick test? >>> >>> blkid does /not/ report that uuid anywhere. >>> >>> git bisect, if I did it correctly, says: >>> >>> >>> 6eba9002956ac40db87d42fb653a0524dc568810 is the first bad commit >>> commit 6eba9002956ac40db87d42fb653a0524dc568810 >>> Author: Goffredo Baroncelli <kreijack@inwind.it> >>> Date: Tue Sep 4 19:59:26 2012 +0200 >>> >>> Correct un-initialized fsid variable >>> >>> :100644 100644 b21a87f827a6250da45f2fb6a1c3a6b651062243 >>> 03952051b5e25e0b67f0f910c84d93eb90de8480 M disk-io.c >> >> Ok, I think this is another case of greedily scanning stale >> backup superblocks (did you ever have btrfs on the whole sda >> or sdb?) >> >> btrfs_read_dev_super() currently tries to scan all 3 superblocks >> (primary & 2 backups). I''m guessing that you have some stale >> backup superblocks on sda and/or sdb. >> >> Before the above commit, if the first sb didn''t look valid, >> it''d skip to the 2nd. If the 2nd (stale) one looked OK, >> it''d compare its fsid to an uniniitialized variable (fsid) >> which would fail (since the "fsid" contents were random.) >> Same for the 3rd backup if found, and eventually it''d return >> -1 as failure and not report the device. >> >> After the commit, it''d skip the first invalid sb as well. >> But this time, it takes the fsid from the 2nd superblock as >> "good" and makes it through the loop thinking that it''s found >> something valid. Hence the report of a device which you didn''t >> expect even though the first superblock is indeed wiped out. >> >> There are some patches floating around to stop this >> backup superblock scanning altogether. >> >> This might fix it for you; it basically returns failure >> if any superblock on the device is found to be bad. >> >> What we really need is the right bits in the right places >> to let the administrator know if a device looks like it might >> be corrupt & in need of fixing, vs. ignoring it altogether. >> >> Not sure if this is something we want upstream but you could >> test if if you like. > > I did test and it appears to resolve the issue for me. > Thank you!Thanks. I need to get back to finding the right overall solution here, but have been busy elsewhere. It''s on the list ;) Anand is looking at it too and has some patches on the list. -Eric -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html