COW and checksumming is enabled by default, and could be disable with mount option. These new ioctls below let a user application turn on/off copy-on-write and checksumming on a per file basis. Signed-off-by: Mingming Cao <cmm@us.ibm.com> diff -r 79b81fd6dd78 ioctl.c --- a/ioctl.c Thu Jun 12 14:46:17 2008 -0400 +++ b/ioctl.c Thu Jun 19 15:59:02 2008 -0700 @@ -471,6 +471,55 @@ static int btrfs_ioctl_defrag(struct fil return 0; } + +static int btrfs_ioctl_disable_datacow(struct file *file) +{ + struct inode *inode = fdentry(file)->d_inode; + + if ((inode->i_mode & S_IFMT) == S_IFREG) { + mutex_lock(&inode->i_mutex); + btrfs_set_flag(inode, NODATACOW); + mutex_unlock(&inode->i_mutex); + } + + return 0; +} + +static int btrfs_ioctl_enable_datacow(struct file *file) +{ + struct inode *inode = fdentry(file)->d_inode; + + if ((inode->i_mode & S_IFMT) == S_IFREG) { + mutex_lock(&inode->i_mutex); + btrfs_clear_flag(inode, NODATACOW); + mutex_unlock(&inode->i_mutex); + } + + return 0; +} + +static int btrfs_ioctl_disable_datasum(struct file *file) +{ + struct inode *inode = fdentry(file)->d_inode; + + mutex_lock(&inode->i_mutex); + btrfs_set_flag(inode, NODATASUM); + mutex_unlock(&inode->i_mutex); + + return 0; +} + +static int btrfs_ioctl_enable_datasum(struct file *file) +{ + struct inode *inode = fdentry(file)->d_inode; + + mutex_lock(&inode->i_mutex); + btrfs_clear_flag(inode, NODATASUM); + mutex_unlock(&inode->i_mutex); + + return 0; +} + long btrfs_ioctl_add_dev(struct btrfs_root *root, void __user *arg) { @@ -775,6 +824,14 @@ long btrfs_ioctl(struct file *file, unsi return btrfs_ioctl_trans_start(file); case BTRFS_IOC_TRANS_END: return btrfs_ioctl_trans_end(file); + case BTRFS_IOC_NODATACOW: + return btrfs_ioctl_disable_datacow(file); + case BTRFS_IOC_DATACOW: + return btrfs_ioctl_enable_datacow(file); + case BTRFS_IOC_NODATASUM: + return btrfs_ioctl_disable_datasum(file); + case BTRFS_IOC_DATASUM: + return btrfs_ioctl_enable_datasum(file); case BTRFS_IOC_SYNC: btrfs_sync_fs(file->f_dentry->d_sb, 1); return 0; diff -r 79b81fd6dd78 ioctl.h --- a/ioctl.h Thu Jun 12 14:46:17 2008 -0400 +++ b/ioctl.h Thu Jun 19 15:32:40 2008 -0700 @@ -51,5 +51,8 @@ struct btrfs_ioctl_vol_args { struct btrfs_ioctl_vol_args) #define BTRFS_IOC_BALANCE _IOW(BTRFS_IOCTL_MAGIC, 12, \ struct btrfs_ioctl_vol_args) - +#define BTRFS_IOC_NODATACOW _IO(BTRFS_IOCTL_MAGIC, 13) +#define BTRFS_IOC_DATACOW _IO(BTRFS_IOCTL_MAGIC, 14) +#define BTRFS_IOC_NODATASUM _IO(BTRFS_IOCTL_MAGIC, 15) +#define BTRFS_IOC_DATASUM _IO(BTRFS_IOCTL_MAGIC, 16) #endif -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> +#define BTRFS_IOC_NODATACOW _IO(BTRFS_IOCTL_MAGIC, 13) > +#define BTRFS_IOC_DATACOW _IO(BTRFS_IOCTL_MAGIC, 14) > +#define BTRFS_IOC_NODATASUM _IO(BTRFS_IOCTL_MAGIC, 15) > +#define BTRFS_IOC_DATASUM _IO(BTRFS_IOCTL_MAGIC, 16)Hmm. Do we really want 4 different ioctl commands to turn 2 features on and off? Surely we could have 1 ioctl which updates a bitfield? Or a ioctl that takes an explicit feature enum argument and a boolean which indicates that it should be enabled or not? (And is ioctl the right interface for this? maybe it should be xattr ops in some defined btrfs string namespace? I''m just making this up. I feel the the lack of a single comment in the patch, while in keeping with the existing precedent in btrfs, leaves a lot of room for wild speculation :)) - z -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jun 20, 2008 at 08:01:54AM -0600, Joe Peterson wrote:> Zach Brown wrote: > > Hmm. Do we really want 4 different ioctl commands to turn 2 features on > > and off? Surely we could have 1 ioctl which updates a bitfield? Or a > > ioctl that takes an explicit feature enum argument and a boolean which > > indicates that it should be enabled or not? > > > > (And is ioctl the right interface for this? maybe it should be xattr > > ops in some defined btrfs string namespace? I''m just making this up. I > > feel the the lack of a single comment in the patch, while in keeping > > with the existing precedent in btrfs, leaves a lot of room for wild > > speculation :)) > > Beyond that, is it really desirable to be able to turn off > checksumming/COW per file? Or even to be able to do it so easily? >Database apps that do their own complicated stuff to make sure everything makes it to the disk properly who dont want the extra overhead of checksumming.> I feel that checksumming is an extremely important feature, and it > scares me even to be able to disable it at all (what if someone > mistakenly does it, and then all of the data is unintentionally > vulnerable?). But at least if it is only a mount option, the mistake > would have to be at as system level and would be harder to do by accident. >What if somebody logs in as root and does an rm -rf? I''m not thinking that running the command to disable checksumming on a file will be something that gets run often by accident, but even if it does the mantra of linux in general has never been "dont let users do something because some idiot could screw it all up."> If someone were to turn off checksumming and then turn it back on, I > assume they would lose any continuity of protection that exists if it > stays on, so it would have to be a major and important decision in an > existing filesystem. >I don''t really understand the above objection. Checksumming doesn''t make everything magically protected, just makes it easier to catch when problems are happening. I don''t see how having it off then turning it on would cause any sorts of issues. Josef -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Zach Brown wrote:> Hmm. Do we really want 4 different ioctl commands to turn 2 features on > and off? Surely we could have 1 ioctl which updates a bitfield? Or a > ioctl that takes an explicit feature enum argument and a boolean which > indicates that it should be enabled or not? > > (And is ioctl the right interface for this? maybe it should be xattr > ops in some defined btrfs string namespace? I''m just making this up. I > feel the the lack of a single comment in the patch, while in keeping > with the existing precedent in btrfs, leaves a lot of room for wild > speculation :))Beyond that, is it really desirable to be able to turn off checksumming/COW per file? Or even to be able to do it so easily? I feel that checksumming is an extremely important feature, and it scares me even to be able to disable it at all (what if someone mistakenly does it, and then all of the data is unintentionally vulnerable?). But at least if it is only a mount option, the mistake would have to be at as system level and would be harder to do by accident. If someone were to turn off checksumming and then turn it back on, I assume they would lose any continuity of protection that exists if it stays on, so it would have to be a major and important decision in an existing filesystem. -Joe -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Josef Bacik wrote:> Database apps that do their own complicated stuff to make sure everything makes > it to the disk properly who dont want the extra overhead of checksumming.I do see the benefit of that, if indeed the DB does end-to-end checking. Then again, I could almost see that perhaps making the setting based on something at the subvolume level (not per-file level) might be even better for that case.> What if somebody logs in as root and does an rm -rf? I''m not thinking that > running the command to disable checksumming on a file will be something that > gets run often by accident, but even if it does the mantra of linux in general > has never been "dont let users do something because some idiot could screw it > all up."Sure, that''s true. I am not saying that we need to protect users from themselves, as long as there is a way to clearly see what the settings are (so a user/admin can verify the state, e.g.). I guess my main concern would be if it is relatively easy for this to go completely unnoticed. Also, would it be only root who could change such a setting?> I don''t really understand the above objection. Checksumming doesn''t make > everything magically protected, just makes it easier to catch when problems are > happening. I don''t see how having it off then turning it on would cause any > sorts of issues.Well, with checksumming on, there is continuous protection. And what I mean by "protection" is peace-of-mind that the data integrity has been checked (I don''t necessarily mean protected from loss). For me, *knowing* that the data is bad is the most important thing, so it does not silently propagate to backups, etc. So, if a file, say, is "protected" by checksumming as it gets read, changed, etc., there is a continuity of protection, and the user can have confidence that the data read is the same as the data written. On the other hand, if one turns off checksumming, there is then a break in that continuity, so any further reads or writes will not be be checked. Even if the file is not written while not having checksumming enabled, but then it is re-enabled, now there is no way to know that the file''s contents are the same as what they were before checksumming was turned off - that confidence is lost. -Joe -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2008-06-20 at 09:07 -0600, Joe Peterson wrote:> Josef Bacik wrote: > > Database apps that do their own complicated stuff to make sure everything makes > > it to the disk properly who dont want the extra overhead of checksumming. > > I do see the benefit of that, if indeed the DB does end-to-end checking. > Then again, I could almost see that perhaps making the setting based on > something at the subvolume level (not per-file level) might be even > better for that case. >But, it''ll be common to mix database files with files that you do want checksummed in the same volume. This is an admin level decision, and we can easily provide knobs to turn it on/off. So, I think the ioctl is really important and we''ll just document it as best we can. Long term, these might end up getting folded into chattr, so I think Zach has a good point about folding them into a single change-file-attributes ioctl. -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Joe Peterson wrote:> Then again, I could almost see that perhaps making the setting based on > something at the subvolume level (not per-file level) might be even > better for that case....> Sure, that''s true. I am not saying that we need to protect users from > themselves, as long as there is a way to clearly see what the settings > are (so a user/admin can verify the state, e.g.). I guess my main > concern would be if it is relatively easy for this to go completely > unnoticed. Also, would it be only root who could change such a setting?It is important to have the per-file granularity. I agree with you that being able to apply this to "all members of this set", where the "set" is whatever is practical for btrfs is a nice ease-of-use feature. The ability to do the change is probably based on "right to change file attributes", which depends on security policy, but it should not be designed as a root-only restriction. As I think someone already said (or hinted), a feature-lockout can be designed so the admnin can do that on some "set" for those who are paranoid. Again for those who are paranoid, making these auditable events (I''m probably not using the right linux term) solves the need to know something like checksum-off has occured. jim -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
jim owens wrote:> As I think someone already said (or hinted), a feature-lockout > can be designed so the admnin can do that on some "set" for > those who are paranoid.That would be a nice feature. Maybe even a mount option could be available to "disable the disabling" of checksumming/COW for cases in which the admin decides users should not be changing these attributes for a particular filesystem.> Again for those who are paranoid, making these auditable events > (I''m probably not using the right linux term) solves the need > to know something like checksum-off has occured.Yes, that reminds me of ZFS''s logging facility, in which case an admin can check to see all operations ever done to the filesystem. When someone is managing a bunch of computers and/or filesystems (making it easy to get forgetful), it is always nice to be able to go back and ask the system what admin options were used as a sanity check. I think that''s a cool idea. -Joe -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Chris Mason wrote:> But, it''ll be common to mix database files with files that you do want > checksummed in the same volume. This is an admin level decision, and we > can easily provide knobs to turn it on/off. So, I think the ioctl is > really important and we''ll just document it as best we can.I do see now why this ability is desirable, especially if there is a way to see a file''s checksum/COW settings (or better still, to query the filesystem for which files have certain settings). My initial concern was that an admin or user may not realize (or forget) a file was left in an "unprotected" state, but yes, I''m sure there are ways to mitigate that possibility. -Joe -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jun 19, 2008 at 10:23:17PM -0700, Zach Brown wrote:> > > +#define BTRFS_IOC_NODATACOW _IO(BTRFS_IOCTL_MAGIC, 13) > > +#define BTRFS_IOC_DATACOW _IO(BTRFS_IOCTL_MAGIC, 14) > > +#define BTRFS_IOC_NODATASUM _IO(BTRFS_IOCTL_MAGIC, 15) > > +#define BTRFS_IOC_DATASUM _IO(BTRFS_IOCTL_MAGIC, 16) > > Hmm. Do we really want 4 different ioctl commands to turn 2 features on > and off? Surely we could have 1 ioctl which updates a bitfield? Or a > ioctl that takes an explicit feature enum argument and a boolean which > indicates that it should be enabled or not?The best interface for per-file boolean flags FS_IOC_SETFLAGS/FS_IOC_GETFLAGS ioctls which would fir this really nicely.> (And is ioctl the right interface for this? maybe it should be xattr > ops in some defined btrfs string namespace? I''m just making this up. I > feel the the lack of a single comment in the patch, while in keeping > with the existing precedent in btrfs, leaves a lot of room for wild > speculation :))Synthetic xattrs are a really utterly horrible interface. Xattrs on disk are nice and simple, but the Linux invention of making some up on the fly, starting with the Posix ACL interface makes the implementation not just utterly complicated but also confuses backup programs. -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, 2008-06-21 at 03:27 -0400, Christoph Hellwig wrote:> On Thu, Jun 19, 2008 at 10:23:17PM -0700, Zach Brown wrote: > > > > > +#define BTRFS_IOC_NODATACOW _IO(BTRFS_IOCTL_MAGIC, 13) > > > +#define BTRFS_IOC_DATACOW _IO(BTRFS_IOCTL_MAGIC, 14) > > > +#define BTRFS_IOC_NODATASUM _IO(BTRFS_IOCTL_MAGIC, 15) > > > +#define BTRFS_IOC_DATASUM _IO(BTRFS_IOCTL_MAGIC, 16) > > > > Hmm. Do we really want 4 different ioctl commands to turn 2 features on > > and off? Surely we could have 1 ioctl which updates a bitfield? Or a > > ioctl that takes an explicit feature enum argument and a boolean which > > indicates that it should be enabled or not? > > The best interface for per-file boolean flags > FS_IOC_SETFLAGS/FS_IOC_GETFLAGS ioctls which would fir this really > nicely. > > > (And is ioctl the right interface for this? maybe it should be xattr > > ops in some defined btrfs string namespace? I''m just making this up. I > > feel the the lack of a single comment in the patch, while in keeping > > with the existing precedent in btrfs, leaves a lot of room for wild > > speculation :)) > > Synthetic xattrs are a really utterly horrible interface. Xattrs on > disk are nice and simple, but the Linux invention of making some up > on the fly, starting with the Posix ACL interface makes the > implementation not just utterly complicated but also confuses backup > programs.The idea is that backup programs already know how to do xattrs and can easily be changed to preserve them. Every ioctl interface we create/make up has to be handed coded into the backup program. I know xattrs are ugly, but we need to weigh the cost of the perfect interface with the availability of a common one. Dave Chinner had talked about using xattrs to control file behavior in XFS as well, not sure if that ever happened. -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sunday 22 June 2008 07:10:06 Chris Mason wrote:> On Sat, 2008-06-21 at 03:27 -0400, Christoph Hellwig wrote: > > Synthetic xattrs are a really utterly horrible interface. Xattrs on > > disk are nice and simple, but the Linux invention of making some up > > on the fly, starting with the Posix ACL interface makes the > > implementation not just utterly complicated but also confuses backup > > programs. > > The idea is that backup programs already know how to do xattrs and can > easily be changed to preserve them. Every ioctl interface we > create/make up has to be handed coded into the backup program. > > I know xattrs are ugly, but we need to weigh the cost of the perfect > interface with the availability of a common one. Dave Chinner had > talked about using xattrs to control file behavior in XFS as well, not > sure if that ever happened.+1 for xattrs. From a user perspective, they''re simpler than futzing around with special btrfs programs to set/unset attributes, and as Chris points out, having backup programs pick them up is a good thing. If virtual xattrs are so complicated/messy to implement, perhaps they should be made simpler. I honestly can''t think of a better interface for something like this. -- Josh -- Joshua J. Berry "I haven''t lost my mind -- it''s backed up on tape somewhere." -- /usr/games/fortune
On Sun, Jun 22, 2008 at 10:10:06AM -0400, Chris Mason wrote:> The idea is that backup programs already know how to do xattrs and can > easily be changed to preserve them. Every ioctl interface we > create/make up has to be handed coded into the backup program.Actually they don''t. Because of the mess we''ve created with the xattr namespaces and the way e.g. ACLs are mapped into two of them they will need to special case the interesting non-user attributes. While sane backup applications already know about the inode flag word because it contains lots of useful information for the existing filesystems.> I know xattrs are ugly, but we need to weigh the cost of the perfect > interface with the availability of a common one. Dave Chinner had > talked about using xattrs to control file behavior in XFS as well, not > sure if that ever happened.I think I''ve successfully talked him out of it :) -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html