Ondřej Kunc
2013-Sep-28 19:49 UTC
Not possible to read device stats for devices added after mount
Hi,

I discovered one minor bug in the BTRFS filesystem. I made a Nagios check for btrfs which reads device statistics for all devices in a mounted btrfs filesystem by calling

    btrfs dev stats /btrfs

But there is one problem: its output looks like this:

[/dev/sda].corruption_errs 0
..
...
[/dev/sdt].generation_errs 0
ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sdb2 failed: No such device
ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sdh failed: No such device
ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sdj failed: No such device
ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sdk failed: No such device
ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sdl failed: No such device
ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sdp failed: No such device
ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sdq failed: No such device
ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sds failed: No such device
ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sde failed: No such device

But this is not true: all of the listed devices exist and are members of the btrfs filesystem. In dmesg I see this:

...
[973077.098957] btrfs: get dev_stats failed, not yet valid
[973077.098984] btrfs: get dev_stats failed, not yet valid
[973077.099011] btrfs: get dev_stats failed, not yet valid
[973077.099038] btrfs: get dev_stats failed, not yet valid
[973077.099065] btrfs: get dev_stats failed, not yet valid
[973077.099092] btrfs: get dev_stats failed, not yet valid
[973077.099118] btrfs: get dev_stats failed, not yet valid
....

What makes device statistics valid? I tried doing a full filesystem scrub, but it did not fix the issue.

Thank you for any hints.

Using this kernel (if it matters):
3.10-2-amd64 #1 SMP Debian 3.10.7-1 (2013-08-17) x86_64 GNU/Linux

Ondřej Kunc
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
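For a monitoring check that runs into this, it may help to treat the ERROR lines as "stats unavailable" rather than as a device failure. The following is only a sketch in C of how the two line shapes shown in the output above could be told apart. The function names and the mapping to Nagios exit codes are assumptions made for illustration, not part of btrfs-progs or of the original check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Standard Nagios plugin exit codes. */
enum { NAGIOS_OK = 0, NAGIOS_CRITICAL = 2, NAGIOS_UNKNOWN = 3 };

/*
 * Classify one line of `btrfs dev stats` output (hypothetical helper).
 * Returns 1 and fills *value for a "[device].counter N" line,
 * 0 for a line starting with "ERROR:", and -1 for anything else.
 */
int classify_stats_line(const char *line, unsigned long long *value)
{
	char dev[256];
	char counter[64];

	assert(line != NULL && value != NULL);
	if (sscanf(line, "[%255[^]]].%63s %llu", dev, counter, value) == 3)
		return 1;
	if (strncmp(line, "ERROR:", 6) == 0)
		return 0;
	return -1;
}

/*
 * Assumed policy: any nonzero counter is CRITICAL; otherwise any
 * device whose stats could not be read is UNKNOWN; otherwise OK.
 */
int check_status(const char *const lines[], size_t n)
{
	int unreadable = 0;
	int nonzero = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		unsigned long long value = 0;
		int kind = classify_stats_line(lines[i], &value);

		if (kind == 1 && value != 0)
			nonzero = 1;
		else if (kind == 0)
			unreadable = 1;
	}
	if (nonzero)
		return NAGIOS_CRITICAL;
	return unreadable ? NAGIOS_UNKNOWN : NAGIOS_OK;
}
```

With this policy, devices that were added after mount show up as UNKNOWN instead of hiding real error counters on the other devices.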
> I discovered one minor bug in BTRFS filesystem.

You sure did.

> ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sde failed: No such device
>
> But this is not true ... all specified devices exist and are members
> of btrfs filesystem. In dmesg I see this:
> ...
> [973077.099118] btrfs: get dev_stats failed, not yet valid
> ....
>
> What makes device statistics valid ? I tried doing full filesystem
> scrub ... but it did not fix that issue.

The stats are only initialized (considered valid) for devices that are
known at mount. You could unmount and mount after adding (or replacing)
new devices and they'd start returning stats.

The following (bad) patch illustrates the problem, but the code should
be restructured so stats are reliably read as devices are added.

- z

From: Zach Brown <zab@redhat.com>
Date: Mon, 30 Sep 2013 17:48:05 -0400
Subject: [PATCH] btrfs: init device stats for new devices

Device stats are only initialized (read from tree items) on mount.
Trying to read device stats after adding or replacing new devices will
return errors.

This cheesy patch demonstrates the problem, but this should really be a
natural side-effect of adding devices to the fs_devices list. We have
evidence that trying to do it by hand doesn't work.

Any preferences for how to restructure this?
---
 fs/btrfs/dev-replace.c | 4 +++-
 fs/btrfs/volumes.c     | 6 ++++++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 5d84443..7309096 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -556,7 +556,9 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 
 	mutex_unlock(&dev_replace->lock_finishing_cancel_unmount);
 
-	return 0;
+	ret = btrfs_init_dev_stats(root->fs_info);
+
+	return ret;
 }
 
 static void btrfs_dev_replace_update_device_in_mapping_tree(
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 0431147..e4ccc9b 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2126,6 +2126,9 @@ int btrfs_init_new_device(struct btrfs_root *root, char *device_path)
 		ret = btrfs_commit_transaction(trans, root);
 	}
 
+	if (!ret)
+		ret = btrfs_init_dev_stats(root->fs_info);
+
 	return ret;
 
 error_trans:
@@ -6060,6 +6063,9 @@ int btrfs_init_dev_stats(struct btrfs_fs_info *fs_info)
 		int item_size;
 		struct btrfs_dev_stats_item *ptr;
 
+		if (device->dev_stats_valid)
+			continue;
+
 		key.objectid = 0;
 		key.type = BTRFS_DEV_STATS_KEY;
 		key.offset = device->devid;
-- 
1.7.11.7
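To make the failure mode concrete, here is a small userspace model in C of the behaviour described above. It is only a sketch: the struct, the function names and the five-counter layout are stand-ins invented for illustration, and the real btrfs_init_dev_stats() reads counters from tree items rather than zeroing them. It shows why a device added after "mount" returns ENODEV until the init pass runs again, and why the patch skips devices that are already valid.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MODEL_STAT_VALUES 5 /* assumed: write, read, flush, corruption, generation */

/* Stand-in for the fields of struct btrfs_device that matter here. */
struct model_device {
	unsigned long long devid;
	int dev_stats_valid;
	unsigned long long stats[MODEL_STAT_VALUES];
};

/*
 * Stand-in for btrfs_init_dev_stats() with the patch applied: devices
 * whose stats are already valid are skipped, so their in-memory
 * counters survive a second pass after a device is added.
 */
void model_init_dev_stats(struct model_device *devs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (devs[i].dev_stats_valid)
			continue;
		memset(devs[i].stats, 0, sizeof(devs[i].stats));
		devs[i].dev_stats_valid = 1;
	}
}

/* Stand-in for the stats ioctl: refuses devices never initialized. */
int model_get_dev_stats(const struct model_device *dev,
			unsigned long long out[MODEL_STAT_VALUES])
{
	assert(dev != NULL && out != NULL);
	if (!dev->dev_stats_valid)
		return -ENODEV; /* userspace sees "No such device" */
	memcpy(out, dev->stats, sizeof(dev->stats));
	return 0;
}
```

Calling model_init_dev_stats() on only the first device mimics mount; a second device then fails with -ENODEV until the init pass is repeated over both.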
Hi Zach,

thank you for your answer and clarification. I cannot just unmount and mount that filesystem, because it is running a busy NFS server right now, so I will try it on a testbench server instead. Would mount -o remount be sufficient (to avoid stopping the service, umount, mount and starting the service again)?

Thank you
Ondřej

2013/9/30 Zach Brown <zab@redhat.com>:
>> I discovered one minor bug in BTRFS filesystem.
>
> You sure did.
>
>> ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sde failed: No such device
>>
>> But this is not true ... all specified devices exist and are members
>> of btrfs filesystem. In dmesg I see this:
>> ...
>> [973077.099118] btrfs: get dev_stats failed, not yet valid
>> ....
>>
>> What makes device statistics valid ? I tried doing full filesystem
>> scrub ... but it did not fix that issue.
>
> The stats are only initialized (considered valid) for devices that are
> known at mount. You could unmount and mount after adding (or replacing)
> new devices and they'd start returning stats.
>
> The following (bad) patch illustrates the problem, but the code should
> be restructured so stats are reliably read as devices are added.
>
> - z
>
> From: Zach Brown <zab@redhat.com>
> Date: Mon, 30 Sep 2013 17:48:05 -0400
> Subject: [PATCH] btrfs: init device stats for new devices
>
> Device stats are only initialized (read from tree items) on mount.
> Trying to read device stats after adding or replacing new devices will
> return errors.
>
> This cheesy patch demonstrates the problem, but this should really be a
> natural side-effect of adding devices to the fs_devices list. We have
> evidence that trying to do it by hand doesn't work.
>
> Any preferences for how to restructure this?
On Tue, Oct 01, 2013 at 12:03:05AM +0200, Ondřej Kunc wrote:
> Hi Zach,
>
> thank you for your answer and clarification. I cannot just unmount and
> mount that filesystem, because it is running busy NFS server now, so I
> will just try it on some testbench server. Can mount -o remount be
> sufficient (to prevent stopping service, umount, mount and starting
> service) ?

Sadly, no, remounting won't initialize the stats.

- z
On Mon, 30 Sep 2013 14:58:02 -0700, Zach Brown wrote:
>> I discovered one minor bug in BTRFS filesystem.
>
> You sure did.
>
>> ERROR: ioctl(BTRFS_IOC_GET_DEV_STATS) on /dev/sde failed: No such device
>>
>> But this is not true ... all specified devices exist and are members
>> of btrfs filesystem. In dmesg I see this:
>> ...
>> [973077.099118] btrfs: get dev_stats failed, not yet valid
>> ....
>>
>> What makes device statistics valid ? I tried doing full filesystem
>> scrub ... but it did not fix that issue.
>
> The stats are only initialized (considered valid) for devices that are
> known at mount. You could unmount and mount after adding (or replacing)
> new devices and they'd start returning stats.
>
> The following (bad) patch illustrates the problem, but the code should
> be restructured so stats are reliably read as devices are added.
>
> - z
>
> From: Zach Brown <zab@redhat.com>
> Date: Mon, 30 Sep 2013 17:48:05 -0400
> Subject: [PATCH] btrfs: init device stats for new devices
>
> Device stats are only initialized (read from tree items) on mount.
> Trying to read device stats after adding or replacing new devices will
> return errors.
>
> This cheesy patch demonstrates the problem, but this should really be a
> natural side-effect of adding devices to the fs_devices list. We have
> evidence that trying to do it by hand doesn't work.
>
> Any preferences for how to restructure this?

btrfs_init_new_device() and btrfs_init_dev_replace_tgtdev() are the two
functions that allocate and initialize new btrfs_device structures after
a filesystem is mounted. IMO device->dev_stats_valid = 1 should be set
there. By that point kzalloc() has already set the statistic values to
zero, which is the correct initial value for new devices.
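As a rough illustration of that suggestion, here is a userspace sketch in C where calloc() plays the role of kzalloc(). The struct and function names are hypothetical and this is not the kernel code; it only shows the idea that a freshly zeroed device can be marked valid at allocation time, with no separate init pass needed.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_STAT_VALUES 5 /* assumed number of per-device counters */

/* Stand-in for the relevant fields of struct btrfs_device. */
struct sketch_device {
	unsigned long long devid;
	int dev_stats_valid;
	unsigned long long stats[SKETCH_STAT_VALUES];
};

/*
 * Allocate a device the way the add/replace paths would. calloc()
 * zero-fills like kzalloc(), so every counter already holds the
 * correct initial value and the stats can be marked valid right away.
 */
struct sketch_device *sketch_alloc_new_device(unsigned long long devid)
{
	struct sketch_device *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->devid = devid;
	dev->dev_stats_valid = 1;
	return dev;
}

/* Returns nonzero when a stats query on this device would succeed. */
int sketch_stats_readable(const struct sketch_device *dev)
{
	assert(dev != NULL);
	return dev->dev_stats_valid;
}
```

A caller would allocate with sketch_alloc_new_device(), query the device immediately, and release it with free().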