Louis Rilling
2008-Jul-04 14:56 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 0/2] configfs: Fix cleanup after mkdir() failure
[ applies on top of http://lkml.org/lkml/2008/6/23/145 aka symlink() fixes ] Hi, This patchset fixes two kinds of bugs happening when configfs_attach_group()/configfs_attach_item() fail and userspace races with mkdir() or symlink(). Please read the first patch header for a detailed scenario explaining the bugs. Louis Changelog: - Few code reworks as requested by Joel (details in patch headers) Summary (2): configfs: Prevent userspace from creating new entries under attaching directories configfs: Lock new directory inodes before removing on cleanup after failure fs/configfs/configfs_internal.h | 2 + fs/configfs/dir.c | 141 ++++++++++++++++++++++++++++++++------- fs/configfs/symlink.c | 15 ++++ 3 files changed, 133 insertions(+), 25 deletions(-) -- Dr Louis Rilling Kerlabs Skype: louis.rilling Batiment Germanium Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes http://www.kerlabs.com/ 35700 Rennes
Louis Rilling
2008-Jul-04 14:56 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 1/2] configfs: Prevent userspace from creating new entries under attaching directories
process 1: process 2: configfs_mkdir("A") attach_group("A") attach_item("A") d_instantiate("A") populate_groups("A") mutex_lock("A") attach_group("A/B") attach_item("A") d_instantiate("A/B") mkdir("A/B/C") do_path_lookup("A/B/C", LOOKUP_PARENT) ok lookup_create("A/B/C") mutex_lock("A/B") ok configfs_mkdir("A/B/C") ok attach_group("A/C") attach_item("A/C") d_instantiate("A/C") populate_groups("A/C") mutex_lock("A/C") attach_group("A/C/D") attach_item("A/C/D") failure mutex_unlock("A/C") detach_groups("A/C") nothing to do mkdir("A/C/E") do_path_lookup("A/C/E", LOOKUP_PARENT) ok lookup_create("A/C/E") mutex_lock("A/C") ok configfs_mkdir("A/C/E") ok detach_item("A/C") d_delete("A/C") mutex_unlock("A") detach_groups("A") mutex_lock("A/B") detach_group("A/B") detach_groups("A/B") nothing since no _default_ group detach_item("A/B") mutex_unlock("A/B") d_delete("A/B") detach_item("A") d_delete("A") Two bugs: 1/ "A/B/C" and "A/C/E" are created, but never removed while their parent are removed in the end. The same could happen with symlink() instead of mkdir(). 2/ "A" and "A/C" inodes are not locked while detach_item() is called on them, which may probably confuse VFS. This commit fixes 1/, tagging new directories with CONFIGFS_USET_CREATING before building the inode and instantiating the dentry, and validating the whole group+default groups hierarchy in a second pass by clearing CONFIGFS_USET_CREATING. mkdir(), symlink(), lookup(), and dir_open() simply return -ENOENT if called in (or linking to) a directory tagged with CONFIGFS_USET_CREATING. This does not prevent userspace from calling stat() successfuly on such directories, but this prevents userspace from adding (children to | symlinking from/to | read/write attributes of | listing the contents of) not validated items. In other words, userspace will not interact with the subsystem on a new item until the new item creation completes correctly. It was first proposed to re-use CONFIGFS_USET_IN_MKDIR instead of a new flag CONFIGFS_USET_CREATING, but this generated conflicts when checking the target of a new symlink: a valid target directory in the middle of attaching a new user-created child item could be wrongly detected as being attached. 2/ is fixed by next commit. Changelog: - Rename configfs_validate_dir() in configfs_dir_set_ready() - Introduce configfs_dirent_is_ready() helper to unbloat a bit CONFIGFS_USET_CREATING checks Signed-off-by: Louis Rilling <louis.rilling at kerlabs.com> --- fs/configfs/configfs_internal.h | 2 + fs/configfs/dir.c | 93 ++++++++++++++++++++++++++++++++++++--- fs/configfs/symlink.c | 15 ++++++ 3 files changed, 104 insertions(+), 6 deletions(-) diff --git a/fs/configfs/configfs_internal.h b/fs/configfs/configfs_internal.h index 5f61b26..762d287 100644 --- a/fs/configfs/configfs_internal.h +++ b/fs/configfs/configfs_internal.h @@ -49,6 +49,7 @@ struct configfs_dirent { #define CONFIGFS_USET_DEFAULT 0x0080 #define CONFIGFS_USET_DROPPING 0x0100 #define CONFIGFS_USET_IN_MKDIR 0x0200 +#define CONFIGFS_USET_CREATING 0x0400 #define CONFIGFS_NOT_PINNED (CONFIGFS_ITEM_ATTR) extern struct mutex configfs_symlink_mutex; @@ -67,6 +68,7 @@ extern void configfs_inode_exit(void); extern int configfs_create_file(struct config_item *, const struct configfs_attribute *); extern int configfs_make_dirent(struct configfs_dirent *, struct dentry *, void *, umode_t, int); +extern int configfs_dirent_is_ready(struct configfs_dirent *); extern int configfs_add_file(struct dentry *, const struct configfs_attribute *, int); extern void configfs_hash_and_remove(struct dentry * dir, const char * name); diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c index 2c873fd..8dd99a2 100644 --- a/fs/configfs/dir.c +++ b/fs/configfs/dir.c @@ -185,7 +185,7 @@ static int create_dir(struct config_item * k, struct dentry * p, error = configfs_dirent_exists(p->d_fsdata, d->d_name.name); if (!error) error = configfs_make_dirent(p->d_fsdata, d, k, mode, - CONFIGFS_DIR); + CONFIGFS_DIR | CONFIGFS_USET_CREATING); if (!error) { error = configfs_create(d, mode, init_dir); if (!error) { @@ -209,6 +209,9 @@ static int create_dir(struct config_item * k, struct dentry * p, * configfs_create_dir - create a directory for an config_item. * @item: config_itemwe're creating directory for. * @dentry: config_item's dentry. + * + * Note: user-created entries won't be allowed under this new directory + * until it is validated by configfs_dir_set_ready() */ static int configfs_create_dir(struct config_item * item, struct dentry *dentry) @@ -231,6 +234,44 @@ static int configfs_create_dir(struct config_item * item, struct dentry *dentry) return error; } +/* + * Allow userspace to create new entries under a new directory created with + * configfs_create_dir(), and under all of its chidlren directories recursively. + * @sd configfs_dirent of the new directory to validate + * + * Caller must hold configfs_dirent_lock. + */ +static void configfs_dir_set_ready(struct configfs_dirent *sd) +{ + struct configfs_dirent *child_sd; + + sd->s_type &= ~CONFIGFS_USET_CREATING; + list_for_each_entry(child_sd, &sd->s_children, s_sibling) + if (child_sd->s_type & CONFIGFS_USET_CREATING) + configfs_dir_set_ready(child_sd); +} + +/* + * Check that a directory does not belong to a directory hierarchy being + * attached and not validated yet. + * @sd configfs_dirent of the directory to check + * + * @return non-zero iff the directory was validated + * + * Note: takes configfs_dirent_lock, so the result may change from false to true + * in two consecutive calls, but never from true to false. + */ +int configfs_dirent_is_ready(struct configfs_dirent *sd) +{ + int ret; + + spin_lock(&configfs_dirent_lock); + ret = !(sd->s_type & CONFIGFS_USET_CREATING); + spin_unlock(&configfs_dirent_lock); + + return ret; +} + int configfs_create_link(struct configfs_symlink *sl, struct dentry *parent, struct dentry *dentry) @@ -330,7 +371,19 @@ static struct dentry * configfs_lookup(struct inode *dir, struct configfs_dirent * parent_sd = dentry->d_parent->d_fsdata; struct configfs_dirent * sd; int found = 0; - int err = 0; + int err; + + /* + * Fake invisibility if dir belongs to a group/default groups hierarchy + * being attached + * + * This forbids userspace to read/write attributes of items which may + * not complete their initialization, since the dentries of the + * attributes won't be instantiated. + */ + err = -ENOENT; + if (!configfs_dirent_is_ready(parent_sd)) + goto out; list_for_each_entry(sd, &parent_sd->s_children, s_sibling) { if (sd->s_type & CONFIGFS_NOT_PINNED) { @@ -353,6 +406,7 @@ static struct dentry * configfs_lookup(struct inode *dir, return simple_lookup(dir, dentry, nd); } +out: return ERR_PTR(err); } @@ -1027,7 +1081,7 @@ EXPORT_SYMBOL(configfs_undepend_item); static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode) { - int ret, module_got = 0; + int ret = 0, module_got = 0; struct config_group *group; struct config_item *item; struct config_item *parent_item; @@ -1043,6 +1097,16 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode) } sd = dentry->d_parent->d_fsdata; + + /* + * Fake invisibility if dir belongs to a group/default groups hierarchy + * being attached + */ + if (!configfs_dirent_is_ready(sd)) { + ret = -ENOENT; + goto out; + } + if (!(sd->s_type & CONFIGFS_USET_DIR)) { ret = -EPERM; goto out; @@ -1137,6 +1201,8 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode) spin_lock(&configfs_dirent_lock); sd->s_type &= ~CONFIGFS_USET_IN_MKDIR; + if (!ret) + configfs_dir_set_ready(dentry->d_fsdata); spin_unlock(&configfs_dirent_lock); out_unlink: @@ -1317,13 +1383,24 @@ static int configfs_dir_open(struct inode *inode, struct file *file) { struct dentry * dentry = file->f_path.dentry; struct configfs_dirent * parent_sd = dentry->d_fsdata; + int err; mutex_lock(&dentry->d_inode->i_mutex); - file->private_data = configfs_new_dirent(parent_sd, NULL); + /* + * Fake invisibility if dir belongs to a group/default groups hierarchy + * being attached + */ + err = -ENOENT; + if (configfs_dirent_is_ready(parent_sd)) { + file->private_data = configfs_new_dirent(parent_sd, NULL); + if (IS_ERR(file->private_data)) + err = PTR_ERR(file->private_data); + else + err = 0; + } mutex_unlock(&dentry->d_inode->i_mutex); - return IS_ERR(file->private_data) ? PTR_ERR(file->private_data) : 0; - + return err; } static int configfs_dir_close(struct inode *inode, struct file *file) @@ -1494,6 +1571,10 @@ int configfs_register_subsystem(struct configfs_subsystem *subsys) if (err) { d_delete(dentry); dput(dentry); + } else { + spin_lock(&configfs_dirent_lock); + configfs_dir_set_ready(dentry->d_fsdata); + spin_unlock(&configfs_dirent_lock); } } diff --git a/fs/configfs/symlink.c b/fs/configfs/symlink.c index 21bd751..e29a77e 100644 --- a/fs/configfs/symlink.c +++ b/fs/configfs/symlink.c @@ -76,6 +76,9 @@ static int create_link(struct config_item *parent_item, struct configfs_symlink *sl; int ret; + ret = -ENOENT; + if (!configfs_dirent_is_ready(target_sd)) + goto out; ret = -ENOMEM; sl = kmalloc(sizeof(struct configfs_symlink), GFP_KERNEL); if (sl) { @@ -100,6 +103,7 @@ static int create_link(struct config_item *parent_item, } } +out: return ret; } @@ -129,6 +133,7 @@ int configfs_symlink(struct inode *dir, struct dentry *dentry, const char *symna { int ret; struct nameidata nd; + struct configfs_dirent *sd; struct config_item *parent_item; struct config_item *target_item; struct config_item_type *type; @@ -137,9 +142,19 @@ int configfs_symlink(struct inode *dir, struct dentry *dentry, const char *symna if (dentry->d_parent == configfs_sb->s_root) goto out; + sd = dentry->d_parent->d_fsdata; + /* + * Fake invisibility if dir belongs to a group/default groups hierarchy + * being attached + */ + ret = -ENOENT; + if (!configfs_dirent_is_ready(sd)) + goto out; + parent_item = configfs_get_config_item(dentry->d_parent); type = parent_item->ci_type; + ret = -EPERM; if (!type || !type->ct_item_ops || !type->ct_item_ops->allow_link) goto out_put; -- 1.5.5.3
Louis Rilling
2008-Jul-04 14:56 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 2/2] configfs: Lock new directory inodes before removing on cleanup after failure
Once a new configfs directory is created by configfs_attach_item() or configfs_attach_group(), a failure in the remaining initialization steps leads to removing a directory which inode the VFS may have already accessed. This commit adds the necessary inode locking to safely remove configfs directories while cleaning up after a failure. As an advantage, the locking rules of populate_groups() and detach_groups() become the same: the caller must have the group's inode mutex locked. Changelog: - Improve comments and braces as requested by Joel Signed-off-by: Louis Rilling <louis.rilling at kerlabs.com> --- fs/configfs/dir.c | 48 +++++++++++++++++++++++++++++------------------- 1 files changed, 29 insertions(+), 19 deletions(-) diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c index 8dd99a2..4252278 100644 --- a/fs/configfs/dir.c +++ b/fs/configfs/dir.c @@ -324,6 +324,8 @@ static void remove_dir(struct dentry * d) * The only thing special about this is that we remove any files in * the directory before we remove the directory, and we've inlined * what used to be configfs_rmdir() below, instead of calling separately. + * + * Caller holds the mutex of the item's inode */ static void configfs_remove_dir(struct config_item * item) @@ -612,36 +614,21 @@ static int create_default_group(struct config_group *parent_group, static int populate_groups(struct config_group *group) { struct config_group *new_group; - struct dentry *dentry = group->cg_item.ci_dentry; int ret = 0; int i; if (group->default_groups) { - /* - * FYI, we're faking mkdir here - * I'm not sure we need this semaphore, as we're called - * from our parent's mkdir. That holds our parent's - * i_mutex, so afaik lookup cannot continue through our - * parent to find us, let alone mess with our tree. - * That said, taking our i_mutex is closer to mkdir - * emulation, and shouldn't hurt. - */ - mutex_lock_nested(&dentry->d_inode->i_mutex, I_MUTEX_CHILD); - for (i = 0; group->default_groups[i]; i++) { new_group = group->default_groups[i]; ret = create_default_group(group, new_group); - if (ret) + if (ret) { + detach_groups(group); break; + } } - - mutex_unlock(&dentry->d_inode->i_mutex); } - if (ret) - detach_groups(group); - return ret; } @@ -756,7 +743,15 @@ static int configfs_attach_item(struct config_item *parent_item, if (!ret) { ret = populate_attrs(item); if (ret) { + /* + * We are going to remove an inode and its dentry but + * the VFS may already have hit and used them. Thus, + * we must lock them as rmdir() would. + */ + mutex_lock(&dentry->d_inode->i_mutex); configfs_remove_dir(item); + dentry->d_inode->i_flags |= S_DEAD; + mutex_unlock(&dentry->d_inode->i_mutex); d_delete(dentry); } } @@ -764,6 +759,7 @@ static int configfs_attach_item(struct config_item *parent_item, return ret; } +/* Caller holds the mutex of the item's inode */ static void configfs_detach_item(struct config_item *item) { detach_attrs(item); @@ -782,16 +778,30 @@ static int configfs_attach_group(struct config_item *parent_item, sd = dentry->d_fsdata; sd->s_type |= CONFIGFS_USET_DIR; + /* + * FYI, we're faking mkdir in populate_groups() + * We must lock the group's inode to avoid races with the VFS + * which can already hit the inode and try to add/remove entries + * under it. + * + * We must also lock the inode to remove it safely in case of + * error, as rmdir() would. + */ + mutex_lock_nested(&dentry->d_inode->i_mutex, I_MUTEX_CHILD); ret = populate_groups(to_config_group(item)); if (ret) { configfs_detach_item(item); - d_delete(dentry); + dentry->d_inode->i_flags |= S_DEAD; } + mutex_unlock(&dentry->d_inode->i_mutex); + if (ret) + d_delete(dentry); } return ret; } +/* Caller holds the mutex of the group's inode */ static void configfs_detach_group(struct config_item *item) { detach_groups(to_config_group(item)); -- 1.5.5.3
Louis Rilling
2008-Jul-04 16:32 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 0/2] configfs: Fix cleanup after mkdir() failure
Joel, Sorry for posting to a wrong address (really fun to see you working at Kerlabs ;)). I've just discovered that git-send-email is case-sensitive with email aliases while mutt is not... Cheers, Louis On Fri, Jul 04, 2008 at 04:56:04PM +0200, Louis Rilling wrote:> [ applies on top of http://lkml.org/lkml/2008/6/23/145 aka symlink() fixes ] > > Hi, > > This patchset fixes two kinds of bugs happening when > configfs_attach_group()/configfs_attach_item() fail and userspace races with > mkdir() or symlink(). > > Please read the first patch header for a detailed scenario explaining the bugs. > > Louis > > Changelog: > - Few code reworks as requested by Joel (details in patch headers) > > Summary (2): > configfs: Prevent userspace from creating new entries under attaching > directories > configfs: Lock new directory inodes before removing on cleanup after > failure > > fs/configfs/configfs_internal.h | 2 + > fs/configfs/dir.c | 141 ++++++++++++++++++++++++++++++++------- > fs/configfs/symlink.c | 15 ++++ > 3 files changed, 133 insertions(+), 25 deletions(-) > > -- > Dr Louis Rilling Kerlabs > Skype: louis.rilling Batiment Germanium > Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes > http://www.kerlabs.com/ 35700 Rennes > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/-- Dr Louis Rilling Kerlabs Skype: louis.rilling Batiment Germanium Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes http://www.kerlabs.com/ 35700 Rennes -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20080704/ff9e1f00/attachment.bin
Joel Becker
2008-Jul-11 20:05 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 2/2] configfs: Lock new directory inodes before removing on cleanup after failure
On Fri, Jul 04, 2008 at 04:56:06PM +0200, Louis Rilling wrote:> Changelog: > - Improve comments and braces as requested by JoelAlso looking good, I want to test them and then send them along. Joel -- "When I am working on a problem I never think about beauty. I only think about how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." - Buckminster Fuller Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127
Joel Becker
2008-Jul-11 22:07 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 0/2] configfs: Fix cleanup after mkdir() failure
On Fri, Jul 04, 2008 at 04:56:04PM +0200, Louis Rilling wrote:> This patchset fixes two kinds of bugs happening when > configfs_attach_group()/configfs_attach_item() fail and userspace races with > mkdir() or symlink().Ok, I've merged all this work. Please check out my configfs-ALL branch and make sure I have everything we've worked on. http://oss.oracle.com/git/?p=jlbec/linux-2.6.git;a=shortlog;h=configfs-ALL git://oss.oracle.com/git/jlbec/linux-2.6.git configfs-ALL Joel -- Life's Little Instruction Book #335 "Every so often, push your luck." Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127