Liu Bo
2012-Aug-23 08:56 UTC
[PATCH] Btrfs-progs: add options to change size in logical to inode transition
Add an option ''s'' to set size in logical to inode transition, then we are able to read all the refs to the logical address. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> --- cmds-inspect.c | 18 ++++++++++++------ 1 files changed, 12 insertions(+), 6 deletions(-) diff --git a/cmds-inspect.c b/cmds-inspect.c index 2f0228f..be5e588 100644 --- a/cmds-inspect.c +++ b/cmds-inspect.c @@ -115,7 +115,7 @@ static int cmd_inode_resolve(int argc, char **argv) } static const char * const cmd_logical_resolve_usage[] = { - "btrfs inspect-internal logical-resolve [-Pv] <logical> <path>", + "btrfs inspect-internal logical-resolve [-Pv] [-s size] <logical> <path>", "Get file system paths for the given logical address", NULL }; @@ -130,12 +130,13 @@ static int cmd_logical_resolve(int argc, char **argv) int bytes_left; struct btrfs_ioctl_logical_ino_args loi; struct btrfs_data_container *inodes; + u64 size = 4096; char full_path[4096]; char *path_ptr; optind = 1; while (1) { - int c = getopt(argc, argv, "Pv"); + int c = getopt(argc, argv, "Pvs:"); if (c < 0) break; @@ -146,6 +147,9 @@ static int cmd_logical_resolve(int argc, char **argv) case ''v'': verbose = 1; break; + case ''s'': + size = atoll(optarg); + break; default: usage(cmd_logical_resolve_usage); } @@ -154,12 +158,13 @@ static int cmd_logical_resolve(int argc, char **argv) if (check_argc_exact(argc - optind, 2)) usage(cmd_logical_resolve_usage); - inodes = malloc(4096); + size = max(size, 4096); + inodes = malloc(size); if (!inodes) return 1; loi.logical = atoll(argv[optind]); - loi.size = 4096; + loi.size = size; loi.inodes = (u64)inodes; fd = open_file_or_dir(argv[optind+1]); @@ -176,8 +181,9 @@ static int cmd_logical_resolve(int argc, char **argv) } if (verbose) - printf("ioctl ret=%d, bytes_left=%lu, bytes_missing=%lu, " - "cnt=%d, missed=%d\n", ret, + printf("ioctl ret=%d, total_size=%llu, bytes_left=%lu, " + "bytes_missing=%lu, cnt=%d, missed=%d\n", + ret, size, (unsigned long)inodes->bytes_left, (unsigned long)inodes->bytes_missing, inodes->elem_cnt, inodes->elem_missed); -- 1.7.7.6 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2012-Aug-23 08:56 UTC
[PATCH] Btrfs: use larger limit for transition of logical to inode
This is the change of the kernel side. Transition of logical to inode used to have a limit 4096 on inode container''s size, but the limit is not large enough for a data with a great many of refs, so when resolving logical address, we can end up with "ioctl ret=0, bytes_left=0, bytes_missing=19944, cnt=510, missed=2493" This changes to regard 4096 as the lowest limit. Signed-off-by: Liu Bo <bo.li.liu@oracle.com> --- fs/btrfs/ioctl.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 9449b84..525915f 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3232,7 +3232,7 @@ static long btrfs_ioctl_logical_to_ino(struct btrfs_root *root, goto out; } - size = min_t(u32, loi->size, 4096); + size = max_t(u32, loi->size, 4096); inodes = init_data_container(size); if (IS_ERR(inodes)) { ret = PTR_ERR(inodes); -- 1.7.7.6 -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jan Schmidt
2012-Aug-23 10:07 UTC
Re: [PATCH] Btrfs: use larger limit for transition of logical to inode
On Thu, August 23, 2012 at 10:56 (+0200), Liu Bo wrote:> This is the change of the kernel side. > > Transition of logical to inode used to have a limit 4096 on inode container''s > size, but the limit is not large enough for a data with a great many of refs, > so when resolving logical address, we can end up with > "ioctl ret=0, bytes_left=0, bytes_missing=19944, cnt=510, missed=2493" > > This changes to regard 4096 as the lowest limit. > > Signed-off-by: Liu Bo <bo.li.liu@oracle.com> > --- > fs/btrfs/ioctl.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c > index 9449b84..525915f 100644 > --- a/fs/btrfs/ioctl.c > +++ b/fs/btrfs/ioctl.c > @@ -3232,7 +3232,7 @@ static long btrfs_ioctl_logical_to_ino(struct btrfs_root *root, > goto out; > } > > - size = min_t(u32, loi->size, 4096); > + size = max_t(u32, loi->size, 4096);Hum. I added this because I wanted to avoid allocations > PAGE_SIZE. We''re doing kmalloc GFP_NOFS with whatever one enters as size, I''m not sure that''s a good idea without any sanitizing. Second, we should probably add a fall back option to vmalloc, in case kmalloc fails? Or should we even go for vmalloc directly, what do you think? Thanks, -Jan> inodes = init_data_container(size); > if (IS_ERR(inodes)) { > ret = PTR_ERR(inodes);-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2012-Aug-23 10:24 UTC
Re: [PATCH] Btrfs: use larger limit for transition of logical to inode
On 08/23/2012 06:07 PM, Jan Schmidt wrote:> On Thu, August 23, 2012 at 10:56 (+0200), Liu Bo wrote: >> This is the change of the kernel side. >> >> Transition of logical to inode used to have a limit 4096 on inode container''s >> size, but the limit is not large enough for a data with a great many of refs, >> so when resolving logical address, we can end up with >> "ioctl ret=0, bytes_left=0, bytes_missing=19944, cnt=510, missed=2493" >> >> This changes to regard 4096 as the lowest limit. >> >> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> >> --- >> fs/btrfs/ioctl.c | 2 +- >> 1 files changed, 1 insertions(+), 1 deletions(-) >> >> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c >> index 9449b84..525915f 100644 >> --- a/fs/btrfs/ioctl.c >> +++ b/fs/btrfs/ioctl.c >> @@ -3232,7 +3232,7 @@ static long btrfs_ioctl_logical_to_ino(struct btrfs_root *root, >> goto out; >> } >> >> - size = min_t(u32, loi->size, 4096); >> + size = max_t(u32, loi->size, 4096); > > Hum. I added this because I wanted to avoid allocations > PAGE_SIZE. We''re doing > kmalloc GFP_NOFS with whatever one enters as size, I''m not sure that''s a good > idea without any sanitizing. >Yeah, I agree. So we do need to make some sanity checks, according to my tests, we need about 30k to resolve a file shared by 4000 snapshots. What about 32k as a upside limit?> Second, we should probably add a fall back option to vmalloc, in case kmalloc > fails? Or should we even go for vmalloc directly, what do you think? >Given loi->size is not reliable, going for vmalloc for an ioctl is reasonable. thanks, liubo> Thanks, > -Jan > >> inodes = init_data_container(size); >> if (IS_ERR(inodes)) { >> ret = PTR_ERR(inodes); > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Liu Bo
2012-Aug-23 10:30 UTC
Re: [PATCH] Btrfs: use larger limit for transition of logical to inode
On 08/23/2012 06:24 PM, Liu Bo wrote:> On 08/23/2012 06:07 PM, Jan Schmidt wrote: >> On Thu, August 23, 2012 at 10:56 (+0200), Liu Bo wrote: >>> This is the change of the kernel side. >>> >>> Transition of logical to inode used to have a limit 4096 on inode container''s >>> size, but the limit is not large enough for a data with a great many of refs, >>> so when resolving logical address, we can end up with >>> "ioctl ret=0, bytes_left=0, bytes_missing=19944, cnt=510, missed=2493" >>> >>> This changes to regard 4096 as the lowest limit. >>> >>> Signed-off-by: Liu Bo <bo.li.liu@oracle.com> >>> --- >>> fs/btrfs/ioctl.c | 2 +- >>> 1 files changed, 1 insertions(+), 1 deletions(-) >>> >>> diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c >>> index 9449b84..525915f 100644 >>> --- a/fs/btrfs/ioctl.c >>> +++ b/fs/btrfs/ioctl.c >>> @@ -3232,7 +3232,7 @@ static long btrfs_ioctl_logical_to_ino(struct btrfs_root *root, >>> goto out; >>> } >>> >>> - size = min_t(u32, loi->size, 4096); >>> + size = max_t(u32, loi->size, 4096); >> >> Hum. I added this because I wanted to avoid allocations > PAGE_SIZE. We''re doing >> kmalloc GFP_NOFS with whatever one enters as size, I''m not sure that''s a good >> idea without any sanitizing. >> > > Yeah, I agree. > > So we do need to make some sanity checks, according to my tests, > we need about 30k to resolve a file shared by 4000 snapshots. >s/4000/2000/g> What about 32k as a upside limit? > >> Second, we should probably add a fall back option to vmalloc, in case kmalloc >> fails? Or should we even go for vmalloc directly, what do you think? >> > > Given loi->size is not reliable, going for vmalloc for an ioctl is reasonable. > > thanks, > liubo > >> Thanks, >> -Jan >> >>> inodes = init_data_container(size); >>> if (IS_ERR(inodes)) { >>> ret = PTR_ERR(inodes); >> >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html >> > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Sterba
2012-Aug-24 16:46 UTC
Re: [PATCH] Btrfs: use larger limit for transition of logical to inode
On Thu, Aug 23, 2012 at 06:24:02PM +0800, Liu Bo wrote:> > Hum. I added this because I wanted to avoid allocations > PAGE_SIZE. We''re doing > > kmalloc GFP_NOFS with whatever one enters as size, I''m not sure that''s a good > > idea without any sanitizing. > > Yeah, I agree. > > So we do need to make some sanity checks, according to my tests, > we need about 30k to resolve a file shared by 4000 snapshots. > > What about 32k as a upside limit?64k? As it''s an upper limit, let''s make it a bit higher than what one can reach normally.> > Second, we should probably add a fall back option to vmalloc, in case kmalloc > > fails? Or should we even go for vmalloc directly, what do you think? > > Given loi->size is not reliable, going for vmalloc for an ioctl is reasonable.Yeah, vmalloc is ok here, we can safely afford to return ENOMEM. AFAICS, we could possibly reuse the userspace buffer, iff build_ino_list uses copy_to_user instead of directly writing to the data buffer. This would eliminate the kernel-side ENOMEM completely at the cost of function call overhead and the access_ok checks. And compared to the simplicity of just vmalloc with rare occurence of ENOMEM, I don''t think it''s worth the work. david -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Sterba
2012-Aug-24 16:57 UTC
Re: [PATCH] Btrfs-progs: add options to change size in logical to inode transition
On Thu, Aug 23, 2012 at 04:56:28PM +0800, Liu Bo wrote:> Add an option ''s'' to set size in logical to inode transition, then we are able > to read all the refs to the logical address. > > Signed-off-by: Liu Bo <bo.li.liu@oracle.com> > --- > cmds-inspect.c | 18 ++++++++++++------ > 1 files changed, 12 insertions(+), 6 deletions(-) > > diff --git a/cmds-inspect.c b/cmds-inspect.c > index 2f0228f..be5e588 100644 > --- a/cmds-inspect.c > +++ b/cmds-inspect.c > @@ -115,7 +115,7 @@ static int cmd_inode_resolve(int argc, char **argv) > } > > static const char * const cmd_logical_resolve_usage[] = { > - "btrfs inspect-internal logical-resolve [-Pv] <logical> <path>", > + "btrfs inspect-internal logical-resolve [-Pv] [-s size] <logical> <path>",Calling it ''size'' si very ambiguous and I needed to go through the code to see where and how the size is actually used. Something like ''bufsize'' would be more understandable for a regular user.> "Get file system paths for the given logical address", > NULLand the option should be documented here, saying that if the list is incomplete, user can increase it and the maximum value (when we agree on it)> }; > @@ -130,12 +130,13 @@ static int cmd_logical_resolve(int argc, char **argv) > int bytes_left; > struct btrfs_ioctl_logical_ino_args loi; > struct btrfs_data_container *inodes; > + u64 size = 4096; > char full_path[4096]; > char *path_ptr; > > optind = 1; > while (1) { > - int c = getopt(argc, argv, "Pv"); > + int c = getopt(argc, argv, "Pvs:"); > if (c < 0) > break; > > @@ -146,6 +147,9 @@ static int cmd_logical_resolve(int argc, char **argv) > case ''v'': > verbose = 1; > break; > + case ''s'': > + size = atoll(optarg); > + break; > default: > usage(cmd_logical_resolve_usage); > } > @@ -154,12 +158,13 @@ static int cmd_logical_resolve(int argc, char **argv) > if (check_argc_exact(argc - optind, 2)) > usage(cmd_logical_resolve_usage); > > - inodes = malloc(4096); > + size = max(size, 4096); > + inodes = malloc(size);Do we need the sanity checks here as well? Actually, if we avoid the kernel buffer allocation, then user can supply a buffer of any size here and we can drop the upper limit. This would be a bit more future proof, when people will generate not thousands but milions of snapshots :)> if (!inodes) > return 1; > > loi.logical = atoll(argv[optind]); > - loi.size = 4096; > + loi.size = size; > loi.inodes = (u64)inodes; > > fd = open_file_or_dir(argv[optind+1]); > @@ -176,8 +181,9 @@ static int cmd_logical_resolve(int argc, char **argv) > } > > if (verbose) > - printf("ioctl ret=%d, bytes_left=%lu, bytes_missing=%lu, " > - "cnt=%d, missed=%d\n", ret, > + printf("ioctl ret=%d, total_size=%llu, bytes_left=%lu, " > + "bytes_missing=%lu, cnt=%d, missed=%d\n", > + ret, size, > (unsigned long)inodes->bytes_left, > (unsigned long)inodes->bytes_missing, > inodes->elem_cnt, inodes->elem_missed); > -- > 1.7.7.6 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html