Louis Rilling
2008-Jul-04 14:56 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 0/2] configfs: Fix cleanup after mkdir() failure
[ applies on top of http://lkml.org/lkml/2008/6/23/145 aka symlink() fixes ]
Hi,
This patchset fixes two kinds of bugs happening when
configfs_attach_group()/configfs_attach_item() fail and userspace races with
mkdir() or symlink().
Please read the first patch header for a detailed scenario explaining the bugs.
Louis
Changelog:
- Few code reworks as requested by Joel (details in patch headers)
Summary (2):
configfs: Prevent userspace from creating new entries under attaching
directories
configfs: Lock new directory inodes before removing on cleanup after
failure
fs/configfs/configfs_internal.h | 2 +
fs/configfs/dir.c | 141 ++++++++++++++++++++++++++++++++-------
fs/configfs/symlink.c | 15 ++++
3 files changed, 133 insertions(+), 25 deletions(-)
--
Dr Louis Rilling Kerlabs
Skype: louis.rilling Batiment Germanium
Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes
http://www.kerlabs.com/ 35700 Rennes
Louis Rilling
2008-Jul-04 14:56 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 1/2] configfs: Prevent userspace from creating new entries under attaching directories
process 1: process 2:
configfs_mkdir("A")
attach_group("A")
attach_item("A")
d_instantiate("A")
populate_groups("A")
mutex_lock("A")
attach_group("A/B")
attach_item("A")
d_instantiate("A/B")
mkdir("A/B/C")
do_path_lookup("A/B/C", LOOKUP_PARENT)
ok
lookup_create("A/B/C")
mutex_lock("A/B")
ok
configfs_mkdir("A/B/C")
ok
attach_group("A/C")
attach_item("A/C")
d_instantiate("A/C")
populate_groups("A/C")
mutex_lock("A/C")
attach_group("A/C/D")
attach_item("A/C/D")
failure
mutex_unlock("A/C")
detach_groups("A/C")
nothing to do
mkdir("A/C/E")
do_path_lookup("A/C/E", LOOKUP_PARENT)
ok
lookup_create("A/C/E")
mutex_lock("A/C")
ok
configfs_mkdir("A/C/E")
ok
detach_item("A/C")
d_delete("A/C")
mutex_unlock("A")
detach_groups("A")
mutex_lock("A/B")
detach_group("A/B")
detach_groups("A/B")
nothing since no _default_ group
detach_item("A/B")
mutex_unlock("A/B")
d_delete("A/B")
detach_item("A")
d_delete("A")
Two bugs:
1/ "A/B/C" and "A/C/E" are created, but never removed while
their parent are
removed in the end. The same could happen with symlink() instead of mkdir().
2/ "A" and "A/C" inodes are not locked while detach_item()
is called on them,
which may probably confuse VFS.
This commit fixes 1/, tagging new directories with CONFIGFS_USET_CREATING before
building the inode and instantiating the dentry, and validating the whole
group+default groups hierarchy in a second pass by clearing
CONFIGFS_USET_CREATING.
mkdir(), symlink(), lookup(), and dir_open() simply return -ENOENT if
called in (or linking to) a directory tagged with CONFIGFS_USET_CREATING. This
does not prevent userspace from calling stat() successfuly on such directories,
but this prevents userspace from adding (children to | symlinking from/to |
read/write attributes of | listing the contents of) not validated items. In
other words, userspace will not interact with the subsystem on a new item until
the new item creation completes correctly.
It was first proposed to re-use CONFIGFS_USET_IN_MKDIR instead of a new
flag CONFIGFS_USET_CREATING, but this generated conflicts when checking the
target of a new symlink: a valid target directory in the middle of attaching
a new user-created child item could be wrongly detected as being attached.
2/ is fixed by next commit.
Changelog:
- Rename configfs_validate_dir() in configfs_dir_set_ready()
- Introduce configfs_dirent_is_ready() helper to unbloat a bit
CONFIGFS_USET_CREATING checks
Signed-off-by: Louis Rilling <louis.rilling at kerlabs.com>
---
fs/configfs/configfs_internal.h | 2 +
fs/configfs/dir.c | 93 ++++++++++++++++++++++++++++++++++++---
fs/configfs/symlink.c | 15 ++++++
3 files changed, 104 insertions(+), 6 deletions(-)
diff --git a/fs/configfs/configfs_internal.h b/fs/configfs/configfs_internal.h
index 5f61b26..762d287 100644
--- a/fs/configfs/configfs_internal.h
+++ b/fs/configfs/configfs_internal.h
@@ -49,6 +49,7 @@ struct configfs_dirent {
#define CONFIGFS_USET_DEFAULT 0x0080
#define CONFIGFS_USET_DROPPING 0x0100
#define CONFIGFS_USET_IN_MKDIR 0x0200
+#define CONFIGFS_USET_CREATING 0x0400
#define CONFIGFS_NOT_PINNED (CONFIGFS_ITEM_ATTR)
extern struct mutex configfs_symlink_mutex;
@@ -67,6 +68,7 @@ extern void configfs_inode_exit(void);
extern int configfs_create_file(struct config_item *, const struct
configfs_attribute *);
extern int configfs_make_dirent(struct configfs_dirent *,
struct dentry *, void *, umode_t, int);
+extern int configfs_dirent_is_ready(struct configfs_dirent *);
extern int configfs_add_file(struct dentry *, const struct configfs_attribute
*, int);
extern void configfs_hash_and_remove(struct dentry * dir, const char * name);
diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c
index 2c873fd..8dd99a2 100644
--- a/fs/configfs/dir.c
+++ b/fs/configfs/dir.c
@@ -185,7 +185,7 @@ static int create_dir(struct config_item * k, struct dentry
* p,
error = configfs_dirent_exists(p->d_fsdata, d->d_name.name);
if (!error)
error = configfs_make_dirent(p->d_fsdata, d, k, mode,
- CONFIGFS_DIR);
+ CONFIGFS_DIR | CONFIGFS_USET_CREATING);
if (!error) {
error = configfs_create(d, mode, init_dir);
if (!error) {
@@ -209,6 +209,9 @@ static int create_dir(struct config_item * k, struct dentry
* p,
* configfs_create_dir - create a directory for an config_item.
* @item: config_itemwe're creating directory for.
* @dentry: config_item's dentry.
+ *
+ * Note: user-created entries won't be allowed under this new directory
+ * until it is validated by configfs_dir_set_ready()
*/
static int configfs_create_dir(struct config_item * item, struct dentry
*dentry)
@@ -231,6 +234,44 @@ static int configfs_create_dir(struct config_item * item,
struct dentry *dentry)
return error;
}
+/*
+ * Allow userspace to create new entries under a new directory created with
+ * configfs_create_dir(), and under all of its chidlren directories
recursively.
+ * @sd configfs_dirent of the new directory to validate
+ *
+ * Caller must hold configfs_dirent_lock.
+ */
+static void configfs_dir_set_ready(struct configfs_dirent *sd)
+{
+ struct configfs_dirent *child_sd;
+
+ sd->s_type &= ~CONFIGFS_USET_CREATING;
+ list_for_each_entry(child_sd, &sd->s_children, s_sibling)
+ if (child_sd->s_type & CONFIGFS_USET_CREATING)
+ configfs_dir_set_ready(child_sd);
+}
+
+/*
+ * Check that a directory does not belong to a directory hierarchy being
+ * attached and not validated yet.
+ * @sd configfs_dirent of the directory to check
+ *
+ * @return non-zero iff the directory was validated
+ *
+ * Note: takes configfs_dirent_lock, so the result may change from false to
true
+ * in two consecutive calls, but never from true to false.
+ */
+int configfs_dirent_is_ready(struct configfs_dirent *sd)
+{
+ int ret;
+
+ spin_lock(&configfs_dirent_lock);
+ ret = !(sd->s_type & CONFIGFS_USET_CREATING);
+ spin_unlock(&configfs_dirent_lock);
+
+ return ret;
+}
+
int configfs_create_link(struct configfs_symlink *sl,
struct dentry *parent,
struct dentry *dentry)
@@ -330,7 +371,19 @@ static struct dentry * configfs_lookup(struct inode *dir,
struct configfs_dirent * parent_sd = dentry->d_parent->d_fsdata;
struct configfs_dirent * sd;
int found = 0;
- int err = 0;
+ int err;
+
+ /*
+ * Fake invisibility if dir belongs to a group/default groups hierarchy
+ * being attached
+ *
+ * This forbids userspace to read/write attributes of items which may
+ * not complete their initialization, since the dentries of the
+ * attributes won't be instantiated.
+ */
+ err = -ENOENT;
+ if (!configfs_dirent_is_ready(parent_sd))
+ goto out;
list_for_each_entry(sd, &parent_sd->s_children, s_sibling) {
if (sd->s_type & CONFIGFS_NOT_PINNED) {
@@ -353,6 +406,7 @@ static struct dentry * configfs_lookup(struct inode *dir,
return simple_lookup(dir, dentry, nd);
}
+out:
return ERR_PTR(err);
}
@@ -1027,7 +1081,7 @@ EXPORT_SYMBOL(configfs_undepend_item);
static int configfs_mkdir(struct inode *dir, struct dentry *dentry, int mode)
{
- int ret, module_got = 0;
+ int ret = 0, module_got = 0;
struct config_group *group;
struct config_item *item;
struct config_item *parent_item;
@@ -1043,6 +1097,16 @@ static int configfs_mkdir(struct inode *dir, struct
dentry *dentry, int mode)
}
sd = dentry->d_parent->d_fsdata;
+
+ /*
+ * Fake invisibility if dir belongs to a group/default groups hierarchy
+ * being attached
+ */
+ if (!configfs_dirent_is_ready(sd)) {
+ ret = -ENOENT;
+ goto out;
+ }
+
if (!(sd->s_type & CONFIGFS_USET_DIR)) {
ret = -EPERM;
goto out;
@@ -1137,6 +1201,8 @@ static int configfs_mkdir(struct inode *dir, struct dentry
*dentry, int mode)
spin_lock(&configfs_dirent_lock);
sd->s_type &= ~CONFIGFS_USET_IN_MKDIR;
+ if (!ret)
+ configfs_dir_set_ready(dentry->d_fsdata);
spin_unlock(&configfs_dirent_lock);
out_unlink:
@@ -1317,13 +1383,24 @@ static int configfs_dir_open(struct inode *inode, struct
file *file)
{
struct dentry * dentry = file->f_path.dentry;
struct configfs_dirent * parent_sd = dentry->d_fsdata;
+ int err;
mutex_lock(&dentry->d_inode->i_mutex);
- file->private_data = configfs_new_dirent(parent_sd, NULL);
+ /*
+ * Fake invisibility if dir belongs to a group/default groups hierarchy
+ * being attached
+ */
+ err = -ENOENT;
+ if (configfs_dirent_is_ready(parent_sd)) {
+ file->private_data = configfs_new_dirent(parent_sd, NULL);
+ if (IS_ERR(file->private_data))
+ err = PTR_ERR(file->private_data);
+ else
+ err = 0;
+ }
mutex_unlock(&dentry->d_inode->i_mutex);
- return IS_ERR(file->private_data) ? PTR_ERR(file->private_data) : 0;
-
+ return err;
}
static int configfs_dir_close(struct inode *inode, struct file *file)
@@ -1494,6 +1571,10 @@ int configfs_register_subsystem(struct configfs_subsystem
*subsys)
if (err) {
d_delete(dentry);
dput(dentry);
+ } else {
+ spin_lock(&configfs_dirent_lock);
+ configfs_dir_set_ready(dentry->d_fsdata);
+ spin_unlock(&configfs_dirent_lock);
}
}
diff --git a/fs/configfs/symlink.c b/fs/configfs/symlink.c
index 21bd751..e29a77e 100644
--- a/fs/configfs/symlink.c
+++ b/fs/configfs/symlink.c
@@ -76,6 +76,9 @@ static int create_link(struct config_item *parent_item,
struct configfs_symlink *sl;
int ret;
+ ret = -ENOENT;
+ if (!configfs_dirent_is_ready(target_sd))
+ goto out;
ret = -ENOMEM;
sl = kmalloc(sizeof(struct configfs_symlink), GFP_KERNEL);
if (sl) {
@@ -100,6 +103,7 @@ static int create_link(struct config_item *parent_item,
}
}
+out:
return ret;
}
@@ -129,6 +133,7 @@ int configfs_symlink(struct inode *dir, struct dentry
*dentry, const char *symna
{
int ret;
struct nameidata nd;
+ struct configfs_dirent *sd;
struct config_item *parent_item;
struct config_item *target_item;
struct config_item_type *type;
@@ -137,9 +142,19 @@ int configfs_symlink(struct inode *dir, struct dentry
*dentry, const char *symna
if (dentry->d_parent == configfs_sb->s_root)
goto out;
+ sd = dentry->d_parent->d_fsdata;
+ /*
+ * Fake invisibility if dir belongs to a group/default groups hierarchy
+ * being attached
+ */
+ ret = -ENOENT;
+ if (!configfs_dirent_is_ready(sd))
+ goto out;
+
parent_item = configfs_get_config_item(dentry->d_parent);
type = parent_item->ci_type;
+ ret = -EPERM;
if (!type || !type->ct_item_ops ||
!type->ct_item_ops->allow_link)
goto out_put;
--
1.5.5.3
Louis Rilling
2008-Jul-04 14:56 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 2/2] configfs: Lock new directory inodes before removing on cleanup after failure
Once a new configfs directory is created by configfs_attach_item() or
configfs_attach_group(), a failure in the remaining initialization steps leads
to removing a directory which inode the VFS may have already accessed.
This commit adds the necessary inode locking to safely remove configfs
directories while cleaning up after a failure. As an advantage, the locking
rules of populate_groups() and detach_groups() become the same: the caller must
have the group's inode mutex locked.
Changelog:
- Improve comments and braces as requested by Joel
Signed-off-by: Louis Rilling <louis.rilling at kerlabs.com>
---
fs/configfs/dir.c | 48 +++++++++++++++++++++++++++++-------------------
1 files changed, 29 insertions(+), 19 deletions(-)
diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c
index 8dd99a2..4252278 100644
--- a/fs/configfs/dir.c
+++ b/fs/configfs/dir.c
@@ -324,6 +324,8 @@ static void remove_dir(struct dentry * d)
* The only thing special about this is that we remove any files in
* the directory before we remove the directory, and we've inlined
* what used to be configfs_rmdir() below, instead of calling separately.
+ *
+ * Caller holds the mutex of the item's inode
*/
static void configfs_remove_dir(struct config_item * item)
@@ -612,36 +614,21 @@ static int create_default_group(struct config_group
*parent_group,
static int populate_groups(struct config_group *group)
{
struct config_group *new_group;
- struct dentry *dentry = group->cg_item.ci_dentry;
int ret = 0;
int i;
if (group->default_groups) {
- /*
- * FYI, we're faking mkdir here
- * I'm not sure we need this semaphore, as we're called
- * from our parent's mkdir. That holds our parent's
- * i_mutex, so afaik lookup cannot continue through our
- * parent to find us, let alone mess with our tree.
- * That said, taking our i_mutex is closer to mkdir
- * emulation, and shouldn't hurt.
- */
- mutex_lock_nested(&dentry->d_inode->i_mutex, I_MUTEX_CHILD);
-
for (i = 0; group->default_groups[i]; i++) {
new_group = group->default_groups[i];
ret = create_default_group(group, new_group);
- if (ret)
+ if (ret) {
+ detach_groups(group);
break;
+ }
}
-
- mutex_unlock(&dentry->d_inode->i_mutex);
}
- if (ret)
- detach_groups(group);
-
return ret;
}
@@ -756,7 +743,15 @@ static int configfs_attach_item(struct config_item
*parent_item,
if (!ret) {
ret = populate_attrs(item);
if (ret) {
+ /*
+ * We are going to remove an inode and its dentry but
+ * the VFS may already have hit and used them. Thus,
+ * we must lock them as rmdir() would.
+ */
+ mutex_lock(&dentry->d_inode->i_mutex);
configfs_remove_dir(item);
+ dentry->d_inode->i_flags |= S_DEAD;
+ mutex_unlock(&dentry->d_inode->i_mutex);
d_delete(dentry);
}
}
@@ -764,6 +759,7 @@ static int configfs_attach_item(struct config_item
*parent_item,
return ret;
}
+/* Caller holds the mutex of the item's inode */
static void configfs_detach_item(struct config_item *item)
{
detach_attrs(item);
@@ -782,16 +778,30 @@ static int configfs_attach_group(struct config_item
*parent_item,
sd = dentry->d_fsdata;
sd->s_type |= CONFIGFS_USET_DIR;
+ /*
+ * FYI, we're faking mkdir in populate_groups()
+ * We must lock the group's inode to avoid races with the VFS
+ * which can already hit the inode and try to add/remove entries
+ * under it.
+ *
+ * We must also lock the inode to remove it safely in case of
+ * error, as rmdir() would.
+ */
+ mutex_lock_nested(&dentry->d_inode->i_mutex, I_MUTEX_CHILD);
ret = populate_groups(to_config_group(item));
if (ret) {
configfs_detach_item(item);
- d_delete(dentry);
+ dentry->d_inode->i_flags |= S_DEAD;
}
+ mutex_unlock(&dentry->d_inode->i_mutex);
+ if (ret)
+ d_delete(dentry);
}
return ret;
}
+/* Caller holds the mutex of the group's inode */
static void configfs_detach_group(struct config_item *item)
{
detach_groups(to_config_group(item));
--
1.5.5.3
Louis Rilling
2008-Jul-04 16:32 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 0/2] configfs: Fix cleanup after mkdir() failure
Joel, Sorry for posting to a wrong address (really fun to see you working at Kerlabs ;)). I've just discovered that git-send-email is case-sensitive with email aliases while mutt is not... Cheers, Louis On Fri, Jul 04, 2008 at 04:56:04PM +0200, Louis Rilling wrote:> [ applies on top of http://lkml.org/lkml/2008/6/23/145 aka symlink() fixes ] > > Hi, > > This patchset fixes two kinds of bugs happening when > configfs_attach_group()/configfs_attach_item() fail and userspace races with > mkdir() or symlink(). > > Please read the first patch header for a detailed scenario explaining the bugs. > > Louis > > Changelog: > - Few code reworks as requested by Joel (details in patch headers) > > Summary (2): > configfs: Prevent userspace from creating new entries under attaching > directories > configfs: Lock new directory inodes before removing on cleanup after > failure > > fs/configfs/configfs_internal.h | 2 + > fs/configfs/dir.c | 141 ++++++++++++++++++++++++++++++++------- > fs/configfs/symlink.c | 15 ++++ > 3 files changed, 133 insertions(+), 25 deletions(-) > > -- > Dr Louis Rilling Kerlabs > Skype: louis.rilling Batiment Germanium > Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes > http://www.kerlabs.com/ 35700 Rennes > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/-- Dr Louis Rilling Kerlabs Skype: louis.rilling Batiment Germanium Phone: (+33|0) 6 80 89 08 23 80 avenue des Buttes de Coesmes http://www.kerlabs.com/ 35700 Rennes -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: Digital signature Url : http://oss.oracle.com/pipermail/ocfs2-devel/attachments/20080704/ff9e1f00/attachment.bin
Joel Becker
2008-Jul-11 20:05 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 2/2] configfs: Lock new directory inodes before removing on cleanup after failure
On Fri, Jul 04, 2008 at 04:56:06PM +0200, Louis Rilling wrote:> Changelog: > - Improve comments and braces as requested by JoelAlso looking good, I want to test them and then send them along. Joel -- "When I am working on a problem I never think about beauty. I only think about how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." - Buckminster Fuller Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127
Joel Becker
2008-Jul-11 22:07 UTC
[Ocfs2-devel] [BUGFIX][PATCH v2 0/2] configfs: Fix cleanup after mkdir() failure
On Fri, Jul 04, 2008 at 04:56:04PM +0200, Louis Rilling wrote:> This patchset fixes two kinds of bugs happening when > configfs_attach_group()/configfs_attach_item() fail and userspace races with > mkdir() or symlink().Ok, I've merged all this work. Please check out my configfs-ALL branch and make sure I have everything we've worked on. http://oss.oracle.com/git/?p=jlbec/linux-2.6.git;a=shortlog;h=configfs-ALL git://oss.oracle.com/git/jlbec/linux-2.6.git configfs-ALL Joel -- Life's Little Instruction Book #335 "Every so often, push your luck." Joel Becker Principal Software Developer Oracle E-mail: joel.becker at oracle.com Phone: (650) 506-8127