Louis Rilling
2008-Jun-26 18:05 UTC
[Ocfs2-devel] [BUGFIX][PATCH 0/2] configfs: Fix cleanup after mkdir() failure
[ applies on top of http://lkml.org/lkml/2008/6/23/145 aka symlink() fixes ] Hi, This patchset fixes two kinds of bugs happening when configfs_attach_group()/configfs_attach_item() fail and userspace races with mkdir() or symlink(). Please read the first patch header for a detailed scenario explaining the bugs. Louis Summary (2): configfs: Prevent userspace from creating new entries under attaching directories configfs: Lock new directory inodes before removing on cleanup after failure fs/configfs/configfs_internal.h | 1 + fs/configfs/dir.c | 127 +++++++++++++++++++++++++++++++-------- fs/configfs/symlink.c | 17 +++++- 3 files changed, 118 insertions(+), 27 deletions(-) -- Dr Louis Rilling Kerlabs Skype: louis.rilling Batiment Germanium Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes http://www.kerlabs.com/ 35700 Rennes
Louis Rilling
2008-Jun-26 18:05 UTC
[Ocfs2-devel] [BUGFIX][PATCH 1/2] configfs: Prevent userspace from creating new entries under attaching directories
process 1: process 2: configfs_mkdir("A") attach_group("A") attach_item("A") d_instantiate("A") populate_groups("A") mutex_lock("A") attach_group("A/B") attach_item("A") d_instantiate("A/B") mkdir("A/B/C") do_path_lookup("A/B/C", LOOKUP_PARENT) ok lookup_create("A/B/C") mutex_lock("A/B") ok configfs_mkdir("A/B/C") ok attach_group("A/C") attach_item("A/C") d_instantiate("A/C") populate_groups("A/C") mutex_lock("A/C") attach_group("A/C/D") attach_item("A/C/D") failure mutex_unlock("A/C") detach_groups("A/C") nothing to do mkdir("A/C/E") do_path_lookup("A/C/E", LOOKUP_PARENT) ok lookup_create("A/C/E") mutex_lock("A/C") ok configfs_mkdir("A/C/E") ok detach_item("A/C") d_delete("A/C") mutex_unlock("A") detach_groups("A") mutex_lock("A/B") detach_group("A/B") detach_groups("A/B") nothing since no _default_ group detach_item("A/B") mutex_unlock("A/B") d_delete("A/B") detach_item("A") d_delete("A") Two bugs: 1/ "A/B/C" and "A/C/E" are created, but never removed while their parent are removed in the end. The same could happen with symlink() instead of mkdir(). 2/ "A" and "A/C" inodes are not locked while detach_item() is called on them, which may probably confuse VFS. This commit fixes 1/, tagging new directories with CONFIGFS_USET_CREATING before building the inode and instantiating the dentry, and validating the whole group+default groups hierarchy in a second pass by clearing CONFIGFS_USET_CREATING. mkdir(), symlink(), lookup(), and dir_open() simply return -ENOENT if called in (or linking to) a directory tagged with CONFIGFS_USET_CREATING. This does not prevent userspace from calling stat() successfuly on such directories, but this prevents userspace from adding (children to | symlinking from/to | read/write attributes of | listing the contents of) not validated items. In other words, userspace will not interact with the subsystem on a new item until the new item creation completes correctly. It was first proposed to re-use CONFIGFS_USET_IN_MKDIR instead of a new flag CONFIGFS_USET_CREATING, but this generated conflicts when checking the target of a new symlink: a valid target directory in the middle of attaching a new user-created child item could be wrongly detected as being attached. 2/ is fixed by next commit. Signed-off-by: Louis Rilling <louis.rilling at kerlabs.com> --- fs/configfs/configfs_internal.h | 1 + fs/configfs/dir.c | 77 ++++++++++++++++++++++++++++++++++++--- fs/configfs/symlink.c | 17 ++++++++- 3 files changed, 89 insertions(+), 6 deletions(-) diff --git a/fs/configfs/configfs_internal.h b/fs/configfs/configfs_internal.h index 5f61b26..51d76bf 100644 --- a/fs/configfs/configfs_internal.h +++ b/fs/configfs/configfs_internal.h @@ -49,6 +49,7 @@ struct configfs_dirent { #define CONFIGFS_USET_DEFAULT 0x0080 #define CONFIGFS_USET_DROPPING 0x0100 #define CONFIGFS_USET_IN_MKDIR 0x0200 +#define CONFIGFS_USET_CREATING 0x0400 #define CONFIGFS_NOT_PINNED (CONFIGFS_ITEM_ATTR) extern struct mutex configfs_symlink_mutex; diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c index 2c873fd..bfd2620 100644 --- a/fs/configfs/dir.c +++ b/fs/configfs/dir.c @@ -185,7 +185,7 @@ static int create_dir(struct config_item * k, struct dentry * p, error = configfs_dirent_exists(p->d_fsdata, d->d_name.name); if (!error) error = configfs_make_dirent(p->d_fsdata, d, k, mode, - CONFIGFS_DIR); + CONFIGFS_DIR | CONFIGFS_USET_CREATING); if (!error) { error = configfs_create(d, mode, init_dir); if (!error) { @@ -209,6 +209,9 @@ static int create_dir(struct config_item * k, struct dentry * p, * configfs_create_dir - create a directory for an config_item. * @item: config_itemwe're creating directory for. * @dentry: config_item's dentry. + * + * Note: user-created entries won't be allowed under this new directory + * until it is validated by configfs_validate_dir() */ static int configfs_create_dir(struct config_item * item, struct dentry *dentry) @@ -231,6 +234,23 @@ static int configfs_create_dir(struct config_item * item, struct dentry *dentry) return error; } +/* + * Allow userspace to create new entries under a new directory created with + * configfs_create_dir(), and under all of its chidlren directories recursively. + * @sd configfs_dirent of the new directory to validate + * + * Caller must hold configfs_dirent_lock. + */ +static void configfs_validate_dir(struct configfs_dirent *sd) +{ + struct configfs_dirent *child_sd; + + sd->s_type &= ~CONFIGFS_USET_CREATING; + list_for_each_entry(child_sd, &sd->s_children, s_sibling) + if (child_sd->s_type & CONFIGFS_USET_CREATING) + configfs_validate_dir(child_sd); +} + int configfs_create_link(struct configfs_symlink *sl, struct dentry *parent, struct dentry *dentry) @@ -332,6 +352,21 @@ static struct dentry * configfs_lookup(struct inode *dir, int found = 0; int err = 0; + /* + * Fake invisibility if dir belongs to a group/default groups hierarchy + * being attached + * + * This forbids userspace to read/write attributes of items which may + * not complete their initialization, since the dentries of the + * attributes won't be instantiated. + */ + spin_lock(&configfs_dirent_lock); + if (parent_sd->s_type & CONFIGFS_USET_CREATING) + err = -ENOENT; + spin_unlock(&configfs_dirent_lock); + if (err) + goto out; + list_for_each_entry(sd, &parent_sd->s_children, s_sibling) { if (sd->s_type & CONFIGFS_NOT_PINNED) { const unsigned char * name = configfs_get_name(sd); @@ -353,6 +388,7 @@ static struct dentry * configfs_lookup(struct inode *dir, return simple_lookup(dir, dentry, nd); } +out: return ERR_PTR(err); } @@ -1027,7 +1063,7 @@ EXPORT_SYMBOL(configfs_undepend_item); static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode) { - int ret, module_got = 0; + int ret = 0, module_got = 0; struct config_group *group; struct config_item *item; struct config_item *parent_item; @@ -1043,6 +1079,18 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode) } sd = dentry->d_parent->d_fsdata; + + /* + * Fake invisibility if dir belongs to a group/default groups hierarchy + * being attached + */ + spin_lock(&configfs_dirent_lock); + if (sd->s_type & CONFIGFS_USET_CREATING) + ret = -ENOENT; + spin_unlock(&configfs_dirent_lock); + if (ret) + goto out; + if (!(sd->s_type & CONFIGFS_USET_DIR)) { ret = -EPERM; goto out; @@ -1137,6 +1185,8 @@ static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode) spin_lock(&configfs_dirent_lock); sd->s_type &= ~CONFIGFS_USET_IN_MKDIR; + if (!ret) + configfs_validate_dir(dentry->d_fsdata); spin_unlock(&configfs_dirent_lock); out_unlink: @@ -1317,13 +1367,26 @@ static int configfs_dir_open(struct inode *inode, struct file *file) { struct dentry * dentry = file->f_path.dentry; struct configfs_dirent * parent_sd = dentry->d_fsdata; + int err = 0; mutex_lock(&dentry->d_inode->i_mutex); - file->private_data = configfs_new_dirent(parent_sd, NULL); - mutex_unlock(&dentry->d_inode->i_mutex); + /* + * Fake invisibility if dir belongs to a group/default groups hierarchy + * being attached + */ + spin_lock(&configfs_dirent_lock); + if (parent_sd->s_type & CONFIGFS_USET_CREATING) + err = -ENOENT; + spin_unlock(&configfs_dirent_lock); - return IS_ERR(file->private_data) ? PTR_ERR(file->private_data) : 0; + if (!err) { + file->private_data = configfs_new_dirent(parent_sd, NULL); + if (IS_ERR(file->private_data)) + err = PTR_ERR(file->private_data); + } + mutex_unlock(&dentry->d_inode->i_mutex); + return err; } static int configfs_dir_close(struct inode *inode, struct file *file) @@ -1494,6 +1557,10 @@ int configfs_register_subsystem(struct configfs_subsystem *subsys) if (err) { d_delete(dentry); dput(dentry); + } else { + spin_lock(&configfs_dirent_lock); + configfs_validate_dir(dentry->d_fsdata); + spin_unlock(&configfs_dirent_lock); } } diff --git a/fs/configfs/symlink.c b/fs/configfs/symlink.c index 21bd751..65623fb 100644 --- a/fs/configfs/symlink.c +++ b/fs/configfs/symlink.c @@ -81,7 +81,7 @@ static int create_link(struct config_item *parent_item, if (sl) { sl->sl_target = config_item_get(item); spin_lock(&configfs_dirent_lock); - if (target_sd->s_type & CONFIGFS_USET_DROPPING) { + if (target_sd->s_type & (CONFIGFS_USET_CREATING | CONFIGFS_USET_DROPPING)) { spin_unlock(&configfs_dirent_lock); config_item_put(item); kfree(sl); @@ -129,6 +129,7 @@ int configfs_symlink(struct inode *dir, struct dentry *dentry, const char *symna { int ret; struct nameidata nd; + struct configfs_dirent *sd; struct config_item *parent_item; struct config_item *target_item; struct config_item_type *type; @@ -137,9 +138,23 @@ int configfs_symlink(struct inode *dir, struct dentry *dentry, const char *symna if (dentry->d_parent == configfs_sb->s_root) goto out; + sd = dentry->d_parent->d_fsdata; + /* + * Fake invisibility if dir belongs to a group/default groups hierarchy + * being attached + */ + ret = 0; + spin_lock(&configfs_dirent_lock); + if (sd->s_type & CONFIGFS_USET_CREATING) + ret = -ENOENT; + spin_unlock(&configfs_dirent_lock); + if (ret) + goto out; + parent_item = configfs_get_config_item(dentry->d_parent); type = parent_item->ci_type; + ret = -EPERM; if (!type || !type->ct_item_ops || !type->ct_item_ops->allow_link) goto out_put; -- 1.5.5.3
Louis Rilling
2008-Jun-26 18:05 UTC
[Ocfs2-devel] [BUGFIX][PATCH 2/2] configfs: Lock new directory inodes before removing on cleanup after failure
Once a new configfs directory is created by configfs_attach_item() or configfs_attach_group(), a failure in the remaining initialization steps leads to removing a directory which inode the VFS may have already accessed. This commit adds the necessary inode locking to safely remove configfs directories while cleaning up after a failure. As an advantage, the locking rules of populate_groups() and detach_groups() become the same: the caller must have the group's inode mutex locked. Signed-off-by: Louis Rilling <louis.rilling at kerlabs.com> --- fs/configfs/dir.c | 50 +++++++++++++++++++++++++++++--------------------- 1 files changed, 29 insertions(+), 21 deletions(-) diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c index bfd2620..c11bc1b 100644 --- a/fs/configfs/dir.c +++ b/fs/configfs/dir.c @@ -303,6 +303,8 @@ static void remove_dir(struct dentry * d) * The only thing special about this is that we remove any files in * the directory before we remove the directory, and we've inlined * what used to be configfs_rmdir() below, instead of calling separately. + * + * Caller holds the mutex of the item's inode */ static void configfs_remove_dir(struct config_item * item) @@ -594,36 +596,20 @@ static int create_default_group(struct config_group *parent_group, static int populate_groups(struct config_group *group) { struct config_group *new_group; - struct dentry *dentry = group->cg_item.ci_dentry; int ret = 0; int i; - if (group->default_groups) { - /* - * FYI, we're faking mkdir here - * I'm not sure we need this semaphore, as we're called - * from our parent's mkdir. That holds our parent's - * i_mutex, so afaik lookup cannot continue through our - * parent to find us, let alone mess with our tree. - * That said, taking our i_mutex is closer to mkdir - * emulation, and shouldn't hurt. - */ - mutex_lock_nested(&dentry->d_inode->i_mutex, I_MUTEX_CHILD); - + if (group->default_groups) for (i = 0; group->default_groups[i]; i++) { new_group = group->default_groups[i]; ret = create_default_group(group, new_group); - if (ret) + if (ret) { + detach_groups(group); break; + } } - mutex_unlock(&dentry->d_inode->i_mutex); - } - - if (ret) - detach_groups(group); - return ret; } @@ -738,7 +724,14 @@ static int configfs_attach_item(struct config_item *parent_item, if (!ret) { ret = populate_attrs(item); if (ret) { + /* + * We are going to remove an inode and its dentry but + * the VFS may already have hit and used them. + */ + mutex_lock(&dentry->d_inode->i_mutex); configfs_remove_dir(item); + dentry->d_inode->i_flags |= S_DEAD; + mutex_unlock(&dentry->d_inode->i_mutex); d_delete(dentry); } } @@ -746,6 +739,7 @@ static int configfs_attach_item(struct config_item *parent_item, return ret; } +/* Caller holds the mutex of the item's inode */ static void configfs_detach_item(struct config_item *item) { detach_attrs(item); @@ -764,16 +758,30 @@ static int configfs_attach_group(struct config_item *parent_item, sd = dentry->d_fsdata; sd->s_type |= CONFIGFS_USET_DIR; + /* + * FYI, we're faking mkdir in populate_groups() + * We must lock the group's inode to avoid races with the VFS + * which can already hit the inode and try to add/remove entries + * under it. + * + * We must also lock the inode to remove it safely in case of + * error. + */ + mutex_lock_nested(&dentry->d_inode->i_mutex, I_MUTEX_CHILD); ret = populate_groups(to_config_group(item)); if (ret) { configfs_detach_item(item); - d_delete(dentry); + dentry->d_inode->i_flags |= S_DEAD; } + mutex_unlock(&dentry->d_inode->i_mutex); + if (ret) + d_delete(dentry); } return ret; } +/* Caller holds the mutex of the group's inode */ static void configfs_detach_group(struct config_item *item) { detach_groups(to_config_group(item)); -- 1.5.5.3
Joel Becker
2008-Jul-03 21:58 UTC
[Ocfs2-devel] [BUGFIX][PATCH 1/2] configfs: Prevent userspace from creating new entries under attaching directories
On Thu, Jun 26, 2008 at 08:05:48PM +0200, Louis Rilling wrote:> This commit fixes 1/, tagging new directories with CONFIGFS_USET_CREATING before > building the inode and instantiating the dentry, and validating the whole > group+default groups hierarchy in a second pass by clearing > CONFIGFS_USET_CREATING.Man, I'm wary of all these in-flight flags. I hope they are all orthogonal :-)> mkdir(), symlink(), lookup(), and dir_open() simply return -ENOENT if > called in (or linking to) a directory tagged with CONFIGFS_USET_CREATING. ThisWhy not block until the create is done?> + /* > + * Fake invisibility if dir belongs to a group/default groups hierarchy > + * being attached > + * > + * This forbids userspace to read/write attributes of items which may > + * not complete their initialization, since the dentries of the > + * attributes won't be instantiated. > + */int configfs_dirent_is_ready(struct configfs_dirent *sd) { int err = 0;> + spin_lock(&configfs_dirent_lock); > + if (parent_sd->s_type & CONFIGFS_USET_CREATING) > + err = -ENOENT; > + spin_unlock(&configfs_dirent_lock);return err; } Then use is_ready() in the five places you check it ;-) Perhaps change configfs_validate_dir() to configfs_dir_set_ready(). I do like the way validate_dir() is coded. Joel -- "There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle." - Albert Einstein Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127
Joel Becker
2008-Jul-03 22:06 UTC
[Ocfs2-devel] [BUGFIX][PATCH 2/2] configfs: Lock new directory inodes before removing on cleanup after failure
On Thu, Jun 26, 2008 at 08:05:49PM +0200, Louis Rilling wrote:> Once a new configfs directory is created by configfs_attach_item() or > configfs_attach_group(), a failure in the remaining initialization steps leads > to removing a directory which inode the VFS may have already accessed. > > This commit adds the necessary inode locking to safely remove configfs > directories while cleaning up after a failure. As an advantage, the locking > rules of populate_groups() and detach_groups() become the same: the caller must > have the group's inode mutex locked.I like this latter symmetry.> static void configfs_remove_dir(struct config_item * item) > @@ -594,36 +596,20 @@ static int create_default_group(struct config_group *parent_group, > static int populate_groups(struct config_group *group) > { > struct config_group *new_group; > - struct dentry *dentry = group->cg_item.ci_dentry; > int ret = 0; > int i; > > - if (group->default_groups) { > - /* > - * FYI, we're faking mkdir here > - * I'm not sure we need this semaphore, as we're called > - * from our parent's mkdir. That holds our parent's > - * i_mutex, so afaik lookup cannot continue through our > - * parent to find us, let alone mess with our tree. > - * That said, taking our i_mutex is closer to mkdir > - * emulation, and shouldn't hurt. > - */ > - mutex_lock_nested(&dentry->d_inode->i_mutex, I_MUTEX_CHILD); > - > + if (group->default_groups)Put the {} around this if. It may only have the for() as a child, but it's 10 lines.> for (i = 0; group->default_groups[i]; i++) { > new_group = group->default_groups[i]; > > ret = create_default_group(group, new_group); > - if (ret) > + if (ret) { > + detach_groups(group); > break; > + } > } ><snip>> @@ -738,7 +724,14 @@ static int configfs_attach_item(struct config_item *parent_item, > if (!ret) { > ret = populate_attrs(item); > if (ret) { > + /* > + * We are going to remove an inode and its dentry but > + * the VFS may already have hit and used them. > + */ > + mutex_lock(&dentry->d_inode->i_mutex); > configfs_remove_dir(item); > + dentry->d_inode->i_flags |= S_DEAD; > + mutex_unlock(&dentry->d_inode->i_mutex);This emulates how rmdir() would lock it, which is quite nice. Perhaps add to the comment "thus, we must lock them as rmdir() would". Joel -- "There is shadow under this red rock. (Come in under the shadow of this red rock) And I will show you something different from either Your shadow at morning striding behind you Or your shadow at evening rising to meet you. I will show you fear in a handful of dust." Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127