Alexander Block
2012-Jun-20 10:35 UTC
[PATCH] Allow cross subvolume reflinks (2nd attempt)
Hello, This is the second attempt to bring in cross subvolume reflinks into btrfs. The first attempt was NAKed due to missing vfs mount checks and a clear description of what btrfs subvolumes are and probably also why cross subvolume reflinks are ok in the case of btrfs. This version of the patch comes from David and is in SUSE kernels since a long time, so it is tested and working. The patch also does proper vfs mount checks, so cross mount point reflinks are not possible with this patch. It only allows cross reflinks between two subvolumes which are in the same mount point. As requested in the older discussion, I''ll try to describe what subvolumes in btrfs are. From the users point of view, a subvolume is just a directory in the mounted fs with its own inode space. This means, that two inodes with the same number from different subvolumes are different files. Parallel to the sub directory form, a subvolume may be mounted in its own vfs mount point (I''m however not sure if this is already possible or just planned). Internally, a btrfs subvolume is a separate btree, containing all the metadata needed to store the fs of the subvolume. Additionally to the subvolumes, btrfs has snapshots, which are basically the same as subvolumes, with the difference that they share metadata and data with the snapshot origin/parent. Everything that applies to subvolumes also applies to snapshots. The clone ioctl requires the destination file to be already existing and opened for writing. This means that we do not touch any vfs semantics regarding creation/opening of files. The ioctl only copies the extents information from the source to the destination, doing proper access checks before. In my opinion, we can compare the clone ioctl to the sendfile syscall, with the difference that we don''t copy data but the information where the data lies. Link to the old discussion: http://thread.gmane.org/gmane.comp.file-systems.btrfs/9864 Only PATCH 2/2 is of interrest now. The discussion also contains some arguments why cross subvolume reflinks are a good thing, but I think this should not be the matter of the new discussion. Alex. David Sterba (1): btrfs: allow cross-subvolume file clone fs/btrfs/ioctl.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) -- 1.7.10
From: David Sterba <dsterba@suse.cz> Lift the EXDEV condition and allow different root trees for files being cloned, then pass source inode''s root when searching for extents. Cloning is not allowed to cross vfsmounts, ie. when two subvolumes from one filesystem are mounted separately. Signed-off-by: David Sterba <dsterba@suse.cz> --- fs/btrfs/ioctl.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 00fd8b5..9e89255 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -2352,6 +2352,10 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd, goto out_drop_write; } + ret = -EXDEV; + if (src_file->f_path.mnt != file->f_path.mnt) + goto out_fput; + src = src_file->f_dentry->d_inode; ret = -EINVAL; @@ -2372,7 +2376,7 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd, goto out_fput; ret = -EXDEV; - if (src->i_sb != inode->i_sb || BTRFS_I(src)->root != root) + if (src->i_sb != inode->i_sb) goto out_fput; ret = -ENOMEM; @@ -2446,13 +2450,14 @@ static noinline long btrfs_ioctl_clone(struct file *file, unsigned long srcfd, * note the key will change type as we walk through the * tree. */ - ret = btrfs_search_slot(NULL, root, &key, path, 0, 0); + ret = btrfs_search_slot(NULL, BTRFS_I(src)->root, &key, path, + 0, 0); if (ret < 0) goto out; nritems = btrfs_header_nritems(path->nodes[0]); if (path->slots[0] >= nritems) { - ret = btrfs_next_leaf(root, path); + ret = btrfs_next_leaf(BTRFS_I(src)->root, path); if (ret < 0) goto out; if (ret > 0) -- 1.7.10
Goffredo Baroncelli
2012-Jun-20 17:18 UTC
Re: [PATCH] Allow cross subvolume reflinks (2nd attempt)
Hi Alexander, On 06/20/2012 12:35 PM, Alexander Block wrote:> The patch also does proper vfs mount checks, so cross mount > point reflinks are not possible with this patch. It only allows cross > reflinks between two subvolumes which are in the same mount point.Thanks for working on that. What happens if two subvolumes of the same filesystem are mounted on two different places ? I usually do: mount -o subvol=__active /dev/sdX / mount -o subvol=. /dev/sdX /var/btrfs So two different subvolumes of the same filesystem (the one on /dev/sdX) are mounted on two different places. Is it possible to do cp --reflink /tmp/foo /var/btrfs/tmp/foo2 Thanks G.Baroncelli -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Alexander Block
2012-Jun-20 17:27 UTC
Re: [PATCH] Allow cross subvolume reflinks (2nd attempt)
On Wed, Jun 20, 2012 at 7:18 PM, Goffredo Baroncelli <kreijack@libero.it> wrote:> Hi Alexander, > > On 06/20/2012 12:35 PM, Alexander Block wrote: >> The patch also does proper vfs mount checks, so cross mount >> point reflinks are not possible with this patch. It only allows cross >> reflinks between two subvolumes which are in the same mount point. > > Thanks for working on that. What happens if two subvolumes of the same > filesystem are mounted on two different places ? > > I usually do: > mount -o subvol=__active /dev/sdX / > mount -o subvol=. /dev/sdX /var/btrfs > > So two different subvolumes of the same filesystem (the one on /dev/sdX) > are mounted on two different places. > > Is it possible to do > cp --reflink /tmp/foo /var/btrfs/tmp/foo2This will fail with -EXDEV. The patch explicitly forbids reflinks crossing vfs mount boundaries.> > Thanks > G.Baroncelli >
Calvin Walton
2012-Jun-20 18:07 UTC
Re: [PATCH] Allow cross subvolume reflinks (2nd attempt)
On Wed, 2012-06-20 at 19:27 +0200, Alexander Block wrote:> On Wed, Jun 20, 2012 at 7:18 PM, Goffredo Baroncelli <kreijack@libero.it> wrote: > > Hi Alexander, > > > > On 06/20/2012 12:35 PM, Alexander Block wrote: > >> The patch also does proper vfs mount checks, so cross mount > >> point reflinks are not possible with this patch. It only allows cross > >> reflinks between two subvolumes which are in the same mount point. > > > > Thanks for working on that. What happens if two subvolumes of the same > > filesystem are mounted on two different places ? > > > > I usually do: > > mount -o subvol=__active /dev/sdX / > > mount -o subvol=. /dev/sdX /var/btrfs > > > > So two different subvolumes of the same filesystem (the one on /dev/sdX) > > are mounted on two different places. > > > > Is it possible to do > > cp --reflink /tmp/foo /var/btrfs/tmp/foo2 > This will fail with -EXDEV. The patch explicitly forbids reflinks crossing vfs > mount boundaries.Which means that if I want to do a reflink copy between two separately mounted subvolumes, I will have to go and do a third mount with a common subvolume above those two, then do the reflink operation within this extra mount? This seems slightly impractical. Is it any extra code to enable doing cross-mount reflinks? Why was the decision made not to allow them? Is there some particular opposition to supporting cross-mount operations on multiple mounts of the same filesystem in general? (I''d love to have rename() work across bind mounts, for example...) -- Calvin Walton <calvin.walton@kepstin.ca>
Goffredo Baroncelli
2012-Jun-20 18:11 UTC
Re: [PATCH] Allow cross subvolume reflinks (2nd attempt)
On 06/20/2012 08:07 PM, Calvin Walton wrote:> On Wed, 2012-06-20 at 19:27 +0200, Alexander Block wrote: >> On Wed, Jun 20, 2012 at 7:18 PM, Goffredo Baroncelli <kreijack@libero.it> wrote: >>> Hi Alexander, >>> >>> On 06/20/2012 12:35 PM, Alexander Block wrote: >>>> The patch also does proper vfs mount checks, so cross mount >>>> point reflinks are not possible with this patch. It only allows cross >>>> reflinks between two subvolumes which are in the same mount point. >>> >>> Thanks for working on that. What happens if two subvolumes of the same >>> filesystem are mounted on two different places ? >>> >>> I usually do: >>> mount -o subvol=__active /dev/sdX / >>> mount -o subvol=. /dev/sdX /var/btrfs >>> >>> So two different subvolumes of the same filesystem (the one on /dev/sdX) >>> are mounted on two different places. >>> >>> Is it possible to do >>> cp --reflink /tmp/foo /var/btrfs/tmp/foo2 >> This will fail with -EXDEV. The patch explicitly forbids reflinks crossing vfs >> mount boundaries. > > Which means that if I want to do a reflink copy between two separately > mounted subvolumes, I will have to go and do a third mount with a common > subvolume above those two, then do the reflink operation within this > extra mount? This seems slightly impractical. > > Is it any extra code to enable doing cross-mount reflinks? Why was the > decision made not to allow them? > > Is there some particular opposition to supporting cross-mount operations > on multiple mounts of the same filesystem in general? (I''d love to have > rename() work across bind mounts, for example...)Yes please, could someone explain the reason behind this decision ? May be there are valid reasons, I am asking only to know which ones ?
Alexander Block
2012-Jun-20 20:05 UTC
Re: [PATCH] Allow cross subvolume reflinks (2nd attempt)
On Wed, Jun 20, 2012 at 8:11 PM, Goffredo Baroncelli <kreijack@libero.it> wrote:> > Yes please, could someone explain the reason behind this decision ? May > be there are valid reasons, I am asking only to know which ones ? >The reason is that at the moment no user visible operations span mount boundaries and there were discussion on whether this should be allowed for the clone ioctl or not. For the moment, I would already be happy to have cross subvolume reflinks which do not span mount points, this would already be much more as we have at the moment.
Goffredo Baroncelli
2012-Jun-20 20:15 UTC
Re: [PATCH] Allow cross subvolume reflinks (2nd attempt)
On 06/20/2012 10:05 PM, Alexander Block wrote:> On Wed, Jun 20, 2012 at 8:11 PM, Goffredo Baroncelli <kreijack@libero.it> wrote: >> >> Yes please, could someone explain the reason behind this decision ? May >> be there are valid reasons, I am asking only to know which ones ? >> > The reason is that at the moment no user visible operations span mount > boundaries > and there were discussion on whether this should be allowed for the > clone ioctl or > not.Yes, I understood the same. But nobody was able to explain why . For the moment, I would already be happy to have cross subvolume reflinks> which do not span mount points, this would already be much more as we have at > the moment.This is reasonable.> . >-- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Marc MERLIN
2012-Jun-27 17:21 UTC
Re: [PATCH] Allow cross subvolume reflinks (2nd attempt)
On Wed, Jun 20, 2012 at 12:35:11PM +0200, Alexander Block wrote:> Hello, > > This is the second attempt to bring in cross subvolume reflinks into btrfs. > The first attempt was NAKed due to missing vfs mount checks and a clear > description of what btrfs subvolumes are and probably also why cross > subvolume reflinks are ok in the case of btrfs. This version of the patch > comes from David and is in SUSE kernels since a long time, so it is tested > and working. The patch also does proper vfs mount checks, so cross mount > point reflinks are not possible with this patch. It only allows cross > reflinks between two subvolumes which are in the same mount point.Thank you for bringing this back. This would have saved me a lot of trouble I''m not familiar with the code, but here''s a big thumbs up from me. Anyone doing reviews, please consider this :) Thanks, Marc -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/