Tyler Hicks
2018-Jul-20 21:56 UTC
[Bridge] [PATCH net-next v3 0/8] Make /sys/class/net per net namespace objects belong to container
This is a revival of an older patch set from Dmitry Torokhov: https://lore.kernel.org/lkml/1471386795-32918-1-git-send-email-dmitry.torokhov at gmail.com/ My submission of v2 is here: https://lore.kernel.org/lkml/1531497949-1766-1-git-send-email-tyhicks at canonical.com/ Here's Dmitry's description: There are objects in /sys hierarchy (/sys/class/net/) that logically belong to a namespace/container. Unfortunately all sysfs objects start their life belonging to global root, and while we could change ownership manually, keeping tracks of all objects that come and go is cumbersome. It would be better if kernel created them using correct uid/gid from the beginning. This series changes kernfs to allow creating object's with arbitrary uid/gid, adds get_ownership() callback to ktype structure so subsystems could supply their own logic (likely tied to namespace support) for determining ownership of kobjects, and adjusts sysfs code to make use of this information. Lastly net-sysfs is adjusted to make sure that objects in net namespace are owned by the root user from the owning user namespace. Note that we do not adjust ownership of objects moved into a new namespace (as when moving a network device into a container) as userspace can easily do it. I'm reviving this patch set because we would like this feature for system containers. One specific use case that we have is that libvirt is unable to configure its bridge device inside of a system container due to the bridge files in /sys/class/net/ being owned by init root instead of container root. The last two patches in this set are patches that I've added to Dmitry's original set to allow such configuration of the bridge device. Eric had previously provided feedback that he didn't favor these changes affecting all layers of the stack and that most of the changes could remain local to drivers/base/core.c. That feedback is certainly sensible but I wanted to send out v2 of the patch set without making that large of a change since quite a bit of time has passed and the bridge changes in the last patch of this set shows that not all of the changes will be local to drivers/base/core.c. I'm happy to make the changes if the original request still stands. * Changes since v2: - Added my Co-Developed-by and Signed-off-by tags to all of Dmitry's patches that I've modified - Patch 1 received build failure fixes in arch/x86/kernel/cpu/intel_rdt_rdtgroup.c - Patch 2 was updated to drop the declaration of sysfs_add_file() from sysfs.h since the patch removed all other uses of the function - Patch 5 is a new patch that prevents tx_maxrate from being written to from inside of a container + Maybe I'm being too cautious here but the restriction can always be loosened up later - Patches 6 and 7 were updated to make net_ns_get_ownership() always initialize uid and gid, even when the network namespace is NULL, so that it isn't a dangerous function to reuse + Requested by Christian Brauner - I've looked at all sysfs attributes affected by this patch set and feel comfortable about the changes. There are quite a few affected attributes that don't have any capable()/ns_capable() checks in their store operations (per_bond_attrs, at91_sysfs_attrs, sysfs_grcan_attrs, ican3_sysfs_attrs, cdc_ncm_sysfs_attrs, qmi_wwan_sysfs_attrs) but I think this is acceptable. It means that container root, rather than specifically CAP_NET_ADMIN inside of the network namespace that the device belongs to, can write to those device attributes. It's the same situation that those devices have today in that init root is able to write to the attributes without necessarily having CAP_NET_ADMIN. I think that this should probably be fixed in order to be consistent with what netdev_store() does by verifying CAP_NET_ADMIN in the network namespace but that it doesn't need to happen in this patch set. Thanks! Tyler
Tyler Hicks
2018-Jul-20 21:56 UTC
[Bridge] [PATCH net-next v3 1/8] kernfs: allow creating kernfs objects with arbitrary uid/gid
From: Dmitry Torokhov <dmitry.torokhov at gmail.com> This change allows creating kernfs files and directories with arbitrary uid/gid instead of always using GLOBAL_ROOT_UID/GID by extending kernfs_create_dir_ns() and kernfs_create_file_ns() with uid/gid arguments. The "simple" kernfs_create_file() and kernfs_create_dir() are left alone and always create objects belonging to the global root. When creating symlinks ownership (uid/gid) is taken from the target kernfs object. Co-Developed-by: Tyler Hicks <tyhicks at canonical.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov at gmail.com> Signed-off-by: Tyler Hicks <tyhicks at canonical.com> --- arch/x86/kernel/cpu/intel_rdt_rdtgroup.c | 4 +++- fs/kernfs/dir.c | 29 ++++++++++++++++++++++++++--- fs/kernfs/file.c | 8 ++++++-- fs/kernfs/inode.c | 2 +- fs/kernfs/kernfs-internal.h | 2 ++ fs/kernfs/symlink.c | 11 ++++++++++- fs/sysfs/dir.c | 4 +++- fs/sysfs/file.c | 5 +++-- include/linux/kernfs.h | 28 +++++++++++++++++++--------- kernel/cgroup/cgroup.c | 4 +++- 10 files changed, 76 insertions(+), 21 deletions(-) diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c index 749856a2e736..9af1a21265d3 100644 --- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c +++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c @@ -146,6 +146,7 @@ static int rdtgroup_add_file(struct kernfs_node *parent_kn, struct rftype *rft) int ret; kn = __kernfs_create_file(parent_kn, rft->name, rft->mode, + GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0, rft->kf_ops, rft, NULL, NULL); if (IS_ERR(kn)) return PTR_ERR(kn); @@ -1503,7 +1504,8 @@ static int mon_addfile(struct kernfs_node *parent_kn, const char *name, struct kernfs_node *kn; int ret = 0; - kn = __kernfs_create_file(parent_kn, name, 0444, 0, + kn = __kernfs_create_file(parent_kn, name, 0444, + GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, 0, &kf_mondata_ops, priv, NULL, NULL); if (IS_ERR(kn)) return PTR_ERR(kn); diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index d66cc0777303..4ca0b5c18192 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -619,6 +619,7 @@ struct kernfs_node *kernfs_node_from_dentry(struct dentry *dentry) static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root, const char *name, umode_t mode, + kuid_t uid, kgid_t gid, unsigned flags) { struct kernfs_node *kn; @@ -661,8 +662,22 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root, kn->mode = mode; kn->flags = flags; + if (!uid_eq(uid, GLOBAL_ROOT_UID) || !gid_eq(gid, GLOBAL_ROOT_GID)) { + struct iattr iattr = { + .ia_valid = ATTR_UID | ATTR_GID, + .ia_uid = uid, + .ia_gid = gid, + }; + + ret = __kernfs_setattr(kn, &iattr); + if (ret < 0) + goto err_out3; + } + return kn; + err_out3: + idr_remove(&root->ino_idr, kn->id.ino); err_out2: kmem_cache_free(kernfs_node_cache, kn); err_out1: @@ -672,11 +687,13 @@ static struct kernfs_node *__kernfs_new_node(struct kernfs_root *root, struct kernfs_node *kernfs_new_node(struct kernfs_node *parent, const char *name, umode_t mode, + kuid_t uid, kgid_t gid, unsigned flags) { struct kernfs_node *kn; - kn = __kernfs_new_node(kernfs_root(parent), name, mode, flags); + kn = __kernfs_new_node(kernfs_root(parent), + name, mode, uid, gid, flags); if (kn) { kernfs_get(parent); kn->parent = parent; @@ -946,6 +963,7 @@ struct kernfs_root *kernfs_create_root(struct kernfs_syscall_ops *scops, root->next_generation = 1; kn = __kernfs_new_node(root, "", S_IFDIR | S_IRUGO | S_IXUGO, + GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, KERNFS_DIR); if (!kn) { idr_destroy(&root->ino_idr); @@ -984,6 +1002,8 @@ void kernfs_destroy_root(struct kernfs_root *root) * @parent: parent in which to create a new directory * @name: name of the new directory * @mode: mode of the new directory + * @uid: uid of the new directory + * @gid: gid of the new directory * @priv: opaque data associated with the new directory * @ns: optional namespace tag of the directory * @@ -991,13 +1011,15 @@ void kernfs_destroy_root(struct kernfs_root *root) */ struct kernfs_node *kernfs_create_dir_ns(struct kernfs_node *parent, const char *name, umode_t mode, + kuid_t uid, kgid_t gid, void *priv, const void *ns) { struct kernfs_node *kn; int rc; /* allocate */ - kn = kernfs_new_node(parent, name, mode | S_IFDIR, KERNFS_DIR); + kn = kernfs_new_node(parent, name, mode | S_IFDIR, + uid, gid, KERNFS_DIR); if (!kn) return ERR_PTR(-ENOMEM); @@ -1028,7 +1050,8 @@ struct kernfs_node *kernfs_create_empty_dir(struct kernfs_node *parent, int rc; /* allocate */ - kn = kernfs_new_node(parent, name, S_IRUGO|S_IXUGO|S_IFDIR, KERNFS_DIR); + kn = kernfs_new_node(parent, name, S_IRUGO|S_IXUGO|S_IFDIR, + GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, KERNFS_DIR); if (!kn) return ERR_PTR(-ENOMEM); diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c index 2015d8c45e4a..dbf5bc250bfd 100644 --- a/fs/kernfs/file.c +++ b/fs/kernfs/file.c @@ -965,6 +965,8 @@ const struct file_operations kernfs_file_fops = { * @parent: directory to create the file in * @name: name of the file * @mode: mode of the file + * @uid: uid of the file + * @gid: gid of the file * @size: size of the file * @ops: kernfs operations for the file * @priv: private data for the file @@ -975,7 +977,8 @@ const struct file_operations kernfs_file_fops = { */ struct kernfs_node *__kernfs_create_file(struct kernfs_node *parent, const char *name, - umode_t mode, loff_t size, + umode_t mode, kuid_t uid, kgid_t gid, + loff_t size, const struct kernfs_ops *ops, void *priv, const void *ns, struct lock_class_key *key) @@ -986,7 +989,8 @@ struct kernfs_node *__kernfs_create_file(struct kernfs_node *parent, flags = KERNFS_FILE; - kn = kernfs_new_node(parent, name, (mode & S_IALLUGO) | S_IFREG, flags); + kn = kernfs_new_node(parent, name, (mode & S_IALLUGO) | S_IFREG, + uid, gid, flags); if (!kn) return ERR_PTR(-ENOMEM); diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c index 3d73fe9d56e2..80cebcd94c90 100644 --- a/fs/kernfs/inode.c +++ b/fs/kernfs/inode.c @@ -63,7 +63,7 @@ static struct kernfs_iattrs *kernfs_iattrs(struct kernfs_node *kn) return ret; } -static int __kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr) +int __kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr) { struct kernfs_iattrs *attrs; struct iattr *iattrs; diff --git a/fs/kernfs/kernfs-internal.h b/fs/kernfs/kernfs-internal.h index 0f260dcca177..3d83b114bb08 100644 --- a/fs/kernfs/kernfs-internal.h +++ b/fs/kernfs/kernfs-internal.h @@ -90,6 +90,7 @@ int kernfs_iop_setattr(struct dentry *dentry, struct iattr *iattr); int kernfs_iop_getattr(const struct path *path, struct kstat *stat, u32 request_mask, unsigned int query_flags); ssize_t kernfs_iop_listxattr(struct dentry *dentry, char *buf, size_t size); +int __kernfs_setattr(struct kernfs_node *kn, const struct iattr *iattr); /* * dir.c @@ -104,6 +105,7 @@ void kernfs_put_active(struct kernfs_node *kn); int kernfs_add_one(struct kernfs_node *kn); struct kernfs_node *kernfs_new_node(struct kernfs_node *parent, const char *name, umode_t mode, + kuid_t uid, kgid_t gid, unsigned flags); struct kernfs_node *kernfs_find_and_get_node_by_ino(struct kernfs_root *root, unsigned int ino); diff --git a/fs/kernfs/symlink.c b/fs/kernfs/symlink.c index 08ccabd7047f..5ffed48f3d0e 100644 --- a/fs/kernfs/symlink.c +++ b/fs/kernfs/symlink.c @@ -21,6 +21,7 @@ * @target: target node for the symlink to point to * * Returns the created node on success, ERR_PTR() value on error. + * Ownership of the link matches ownership of the target. */ struct kernfs_node *kernfs_create_link(struct kernfs_node *parent, const char *name, @@ -28,8 +29,16 @@ struct kernfs_node *kernfs_create_link(struct kernfs_node *parent, { struct kernfs_node *kn; int error; + kuid_t uid = GLOBAL_ROOT_UID; + kgid_t gid = GLOBAL_ROOT_GID; - kn = kernfs_new_node(parent, name, S_IFLNK|S_IRWXUGO, KERNFS_LINK); + if (target->iattr) { + uid = target->iattr->ia_iattr.ia_uid; + gid = target->iattr->ia_iattr.ia_gid; + } + + kn = kernfs_new_node(parent, name, S_IFLNK|S_IRWXUGO, uid, gid, + KERNFS_LINK); if (!kn) return ERR_PTR(-ENOMEM); diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c index 58eba92a0e41..e39b884f0867 100644 --- a/fs/sysfs/dir.c +++ b/fs/sysfs/dir.c @@ -52,7 +52,9 @@ int sysfs_create_dir_ns(struct kobject *kobj, const void *ns) return -ENOENT; kn = kernfs_create_dir_ns(parent, kobject_name(kobj), - S_IRWXU | S_IRUGO | S_IXUGO, kobj, ns); + S_IRWXU | S_IRUGO | S_IXUGO, + GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, + kobj, ns); if (IS_ERR(kn)) { if (PTR_ERR(kn) == -EEXIST) sysfs_warn_dup(parent, kobject_name(kobj)); diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c index 5c13f29bfcdb..513fa691ecbd 100644 --- a/fs/sysfs/file.c +++ b/fs/sysfs/file.c @@ -302,8 +302,9 @@ int sysfs_add_file_mode_ns(struct kernfs_node *parent, if (!attr->ignore_lockdep) key = attr->key ?: (struct lock_class_key *)&attr->skey; #endif - kn = __kernfs_create_file(parent, attr->name, mode & 0777, size, ops, - (void *)attr, ns, key); + kn = __kernfs_create_file(parent, attr->name, + mode & 0777, GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, + size, ops, (void *)attr, ns, key); if (IS_ERR(kn)) { if (PTR_ERR(kn) == -EEXIST) sysfs_warn_dup(parent, attr->name); diff --git a/include/linux/kernfs.h b/include/linux/kernfs.h index ab25c8b6d9e3..814643f7ee52 100644 --- a/include/linux/kernfs.h +++ b/include/linux/kernfs.h @@ -15,6 +15,7 @@ #include <linux/lockdep.h> #include <linux/rbtree.h> #include <linux/atomic.h> +#include <linux/uidgid.h> #include <linux/wait.h> struct file; @@ -325,12 +326,14 @@ void kernfs_destroy_root(struct kernfs_root *root); struct kernfs_node *kernfs_create_dir_ns(struct kernfs_node *parent, const char *name, umode_t mode, + kuid_t uid, kgid_t gid, void *priv, const void *ns); struct kernfs_node *kernfs_create_empty_dir(struct kernfs_node *parent, const char *name); struct kernfs_node *__kernfs_create_file(struct kernfs_node *parent, - const char *name, - umode_t mode, loff_t size, + const char *name, umode_t mode, + kuid_t uid, kgid_t gid, + loff_t size, const struct kernfs_ops *ops, void *priv, const void *ns, struct lock_class_key *key); @@ -415,12 +418,14 @@ static inline void kernfs_destroy_root(struct kernfs_root *root) { } static inline struct kernfs_node * kernfs_create_dir_ns(struct kernfs_node *parent, const char *name, - umode_t mode, void *priv, const void *ns) + umode_t mode, kuid_t uid, kgid_t gid, + void *priv, const void *ns) { return ERR_PTR(-ENOSYS); } static inline struct kernfs_node * __kernfs_create_file(struct kernfs_node *parent, const char *name, - umode_t mode, loff_t size, const struct kernfs_ops *ops, + umode_t mode, kuid_t uid, kgid_t gid, + loff_t size, const struct kernfs_ops *ops, void *priv, const void *ns, struct lock_class_key *key) { return ERR_PTR(-ENOSYS); } @@ -498,12 +503,15 @@ static inline struct kernfs_node * kernfs_create_dir(struct kernfs_node *parent, const char *name, umode_t mode, void *priv) { - return kernfs_create_dir_ns(parent, name, mode, priv, NULL); + return kernfs_create_dir_ns(parent, name, mode, + GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, + priv, NULL); } static inline struct kernfs_node * kernfs_create_file_ns(struct kernfs_node *parent, const char *name, - umode_t mode, loff_t size, const struct kernfs_ops *ops, + umode_t mode, kuid_t uid, kgid_t gid, + loff_t size, const struct kernfs_ops *ops, void *priv, const void *ns) { struct lock_class_key *key = NULL; @@ -511,15 +519,17 @@ kernfs_create_file_ns(struct kernfs_node *parent, const char *name, #ifdef CONFIG_DEBUG_LOCK_ALLOC key = (struct lock_class_key *)&ops->lockdep_key; #endif - return __kernfs_create_file(parent, name, mode, size, ops, priv, ns, - key); + return __kernfs_create_file(parent, name, mode, uid, gid, + size, ops, priv, ns, key); } static inline struct kernfs_node * kernfs_create_file(struct kernfs_node *parent, const char *name, umode_t mode, loff_t size, const struct kernfs_ops *ops, void *priv) { - return kernfs_create_file_ns(parent, name, mode, size, ops, priv, NULL); + return kernfs_create_file_ns(parent, name, mode, + GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, + size, ops, priv, NULL); } static inline int kernfs_remove_by_name(struct kernfs_node *parent, diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 077370bf8964..35cf3d71f8aa 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -3557,7 +3557,9 @@ static int cgroup_add_file(struct cgroup_subsys_state *css, struct cgroup *cgrp, key = &cft->lockdep_key; #endif kn = __kernfs_create_file(cgrp->kn, cgroup_file_name(cgrp, cft, name), - cgroup_file_mode(cft), 0, cft->kf_ops, cft, + cgroup_file_mode(cft), + GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, + 0, cft->kf_ops, cft, NULL, key); if (IS_ERR(kn)) return PTR_ERR(kn); -- 2.7.4
Tyler Hicks
2018-Jul-20 21:56 UTC
[Bridge] [PATCH net-next v3 2/8] sysfs, kobject: allow creating kobject belonging to arbitrary users
From: Dmitry Torokhov <dmitry.torokhov at gmail.com> Normally kobjects and their sysfs representation belong to global root, however it is not necessarily the case for objects in separate namespaces. For example, objects in separate network namespace logically belong to the container's root and not global root. This change lays groundwork for allowing network namespace objects ownership to be transferred to container's root user by defining get_ownership() callback in ktype structure and using it in sysfs code to retrieve desired uid/gid when creating sysfs objects for given kobject. Co-Developed-by: Tyler Hicks <tyhicks at canonical.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov at gmail.com> Signed-off-by: Tyler Hicks <tyhicks at canonical.com> --- fs/sysfs/dir.c | 7 +++++-- fs/sysfs/file.c | 32 ++++++++++++++++++++------------ fs/sysfs/group.c | 23 +++++++++++++++++------ fs/sysfs/sysfs.h | 5 ++--- include/linux/kobject.h | 4 ++++ lib/kobject.c | 19 +++++++++++++++++++ 6 files changed, 67 insertions(+), 23 deletions(-) diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c index e39b884f0867..feeae8081c22 100644 --- a/fs/sysfs/dir.c +++ b/fs/sysfs/dir.c @@ -40,6 +40,8 @@ void sysfs_warn_dup(struct kernfs_node *parent, const char *name) int sysfs_create_dir_ns(struct kobject *kobj, const void *ns) { struct kernfs_node *parent, *kn; + kuid_t uid; + kgid_t gid; BUG_ON(!kobj); @@ -51,9 +53,10 @@ int sysfs_create_dir_ns(struct kobject *kobj, const void *ns) if (!parent) return -ENOENT; + kobject_get_ownership(kobj, &uid, &gid); + kn = kernfs_create_dir_ns(parent, kobject_name(kobj), - S_IRWXU | S_IRUGO | S_IXUGO, - GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, + S_IRWXU | S_IRUGO | S_IXUGO, uid, gid, kobj, ns); if (IS_ERR(kn)) { if (PTR_ERR(kn) == -EEXIST) diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c index 513fa691ecbd..fa46216523cf 100644 --- a/fs/sysfs/file.c +++ b/fs/sysfs/file.c @@ -245,7 +245,7 @@ static const struct kernfs_ops sysfs_bin_kfops_mmap = { int sysfs_add_file_mode_ns(struct kernfs_node *parent, const struct attribute *attr, bool is_bin, - umode_t mode, const void *ns) + umode_t mode, kuid_t uid, kgid_t gid, const void *ns) { struct lock_class_key *key = NULL; const struct kernfs_ops *ops; @@ -302,8 +302,8 @@ int sysfs_add_file_mode_ns(struct kernfs_node *parent, if (!attr->ignore_lockdep) key = attr->key ?: (struct lock_class_key *)&attr->skey; #endif - kn = __kernfs_create_file(parent, attr->name, - mode & 0777, GLOBAL_ROOT_UID, GLOBAL_ROOT_GID, + + kn = __kernfs_create_file(parent, attr->name, mode & 0777, uid, gid, size, ops, (void *)attr, ns, key); if (IS_ERR(kn)) { if (PTR_ERR(kn) == -EEXIST) @@ -313,12 +313,6 @@ int sysfs_add_file_mode_ns(struct kernfs_node *parent, return 0; } -int sysfs_add_file(struct kernfs_node *parent, const struct attribute *attr, - bool is_bin) -{ - return sysfs_add_file_mode_ns(parent, attr, is_bin, attr->mode, NULL); -} - /** * sysfs_create_file_ns - create an attribute file for an object with custom ns * @kobj: object we're creating for @@ -328,9 +322,14 @@ int sysfs_add_file(struct kernfs_node *parent, const struct attribute *attr, int sysfs_create_file_ns(struct kobject *kobj, const struct attribute *attr, const void *ns) { + kuid_t uid; + kgid_t gid; + BUG_ON(!kobj || !kobj->sd || !attr); - return sysfs_add_file_mode_ns(kobj->sd, attr, false, attr->mode, ns); + kobject_get_ownership(kobj, &uid, &gid); + return sysfs_add_file_mode_ns(kobj->sd, attr, false, attr->mode, + uid, gid, ns); } EXPORT_SYMBOL_GPL(sysfs_create_file_ns); @@ -359,6 +358,8 @@ int sysfs_add_file_to_group(struct kobject *kobj, const struct attribute *attr, const char *group) { struct kernfs_node *parent; + kuid_t uid; + kgid_t gid; int error; if (group) { @@ -371,7 +372,9 @@ int sysfs_add_file_to_group(struct kobject *kobj, if (!parent) return -ENOENT; - error = sysfs_add_file(parent, attr, false); + kobject_get_ownership(kobj, &uid, &gid); + error = sysfs_add_file_mode_ns(kobj->sd, attr, false, + attr->mode, uid, gid, NULL); kernfs_put(parent); return error; @@ -487,9 +490,14 @@ EXPORT_SYMBOL_GPL(sysfs_remove_file_from_group); int sysfs_create_bin_file(struct kobject *kobj, const struct bin_attribute *attr) { + kuid_t uid; + kgid_t gid; + BUG_ON(!kobj || !kobj->sd || !attr); - return sysfs_add_file(kobj->sd, &attr->attr, true); + kobject_get_ownership(kobj, &uid, &gid); + return sysfs_add_file_mode_ns(kobj->sd, &attr->attr, true, + attr->attr.mode, uid, gid, NULL); } EXPORT_SYMBOL_GPL(sysfs_create_bin_file); diff --git a/fs/sysfs/group.c b/fs/sysfs/group.c index 4802ec0e1e3a..c7a716c4acc9 100644 --- a/fs/sysfs/group.c +++ b/fs/sysfs/group.c @@ -31,6 +31,7 @@ static void remove_files(struct kernfs_node *parent, } static int create_files(struct kernfs_node *parent, struct kobject *kobj, + kuid_t uid, kgid_t gid, const struct attribute_group *grp, int update) { struct attribute *const *attr; @@ -60,7 +61,7 @@ static int create_files(struct kernfs_node *parent, struct kobject *kobj, mode &= SYSFS_PREALLOC | 0664; error = sysfs_add_file_mode_ns(parent, *attr, false, - mode, NULL); + mode, uid, gid, NULL); if (unlikely(error)) break; } @@ -90,7 +91,8 @@ static int create_files(struct kernfs_node *parent, struct kobject *kobj, mode &= SYSFS_PREALLOC | 0664; error = sysfs_add_file_mode_ns(parent, &(*bin_attr)->attr, true, - mode, NULL); + mode, + uid, gid, NULL); if (error) break; } @@ -106,6 +108,8 @@ static int internal_create_group(struct kobject *kobj, int update, const struct attribute_group *grp) { struct kernfs_node *kn; + kuid_t uid; + kgid_t gid; int error; BUG_ON(!kobj || (!update && !kobj->sd)); @@ -118,9 +122,11 @@ static int internal_create_group(struct kobject *kobj, int update, kobj->name, grp->name ?: ""); return -EINVAL; } + kobject_get_ownership(kobj, &uid, &gid); if (grp->name) { - kn = kernfs_create_dir(kobj->sd, grp->name, - S_IRWXU | S_IRUGO | S_IXUGO, kobj); + kn = kernfs_create_dir_ns(kobj->sd, grp->name, + S_IRWXU | S_IRUGO | S_IXUGO, + uid, gid, kobj, NULL); if (IS_ERR(kn)) { if (PTR_ERR(kn) == -EEXIST) sysfs_warn_dup(kobj->sd, grp->name); @@ -129,7 +135,7 @@ static int internal_create_group(struct kobject *kobj, int update, } else kn = kobj->sd; kernfs_get(kn); - error = create_files(kn, kobj, grp, update); + error = create_files(kn, kobj, uid, gid, grp, update); if (error) { if (grp->name) kernfs_remove(kn); @@ -281,6 +287,8 @@ int sysfs_merge_group(struct kobject *kobj, const struct attribute_group *grp) { struct kernfs_node *parent; + kuid_t uid; + kgid_t gid; int error = 0; struct attribute *const *attr; int i; @@ -289,8 +297,11 @@ int sysfs_merge_group(struct kobject *kobj, if (!parent) return -ENOENT; + kobject_get_ownership(kobj, &uid, &gid); + for ((i = 0, attr = grp->attrs); *attr && !error; (++i, ++attr)) - error = sysfs_add_file(parent, *attr, false); + error = sysfs_add_file_mode_ns(parent, *attr, false, + (*attr)->mode, uid, gid, NULL); if (error) { while (--i >= 0) kernfs_remove_by_name(parent, (*--attr)->name); diff --git a/fs/sysfs/sysfs.h b/fs/sysfs/sysfs.h index d098e015fcc9..0050cc0c0236 100644 --- a/fs/sysfs/sysfs.h +++ b/fs/sysfs/sysfs.h @@ -27,11 +27,10 @@ void sysfs_warn_dup(struct kernfs_node *parent, const char *name); /* * file.c */ -int sysfs_add_file(struct kernfs_node *parent, - const struct attribute *attr, bool is_bin); int sysfs_add_file_mode_ns(struct kernfs_node *parent, const struct attribute *attr, bool is_bin, - umode_t amode, const void *ns); + umode_t amode, kuid_t uid, kgid_t gid, + const void *ns); /* * symlink.c diff --git a/include/linux/kobject.h b/include/linux/kobject.h index 7f6f93c3df9c..b49ff230beba 100644 --- a/include/linux/kobject.h +++ b/include/linux/kobject.h @@ -26,6 +26,7 @@ #include <linux/wait.h> #include <linux/atomic.h> #include <linux/workqueue.h> +#include <linux/uidgid.h> #define UEVENT_HELPER_PATH_LEN 256 #define UEVENT_NUM_ENVP 32 /* number of env pointers */ @@ -114,6 +115,8 @@ extern struct kobject * __must_check kobject_get_unless_zero( extern void kobject_put(struct kobject *kobj); extern const void *kobject_namespace(struct kobject *kobj); +extern void kobject_get_ownership(struct kobject *kobj, + kuid_t *uid, kgid_t *gid); extern char *kobject_get_path(struct kobject *kobj, gfp_t flag); struct kobj_type { @@ -122,6 +125,7 @@ struct kobj_type { struct attribute **default_attrs; const struct kobj_ns_type_operations *(*child_ns_type)(struct kobject *kobj); const void *(*namespace)(struct kobject *kobj); + void (*get_ownership)(struct kobject *kobj, kuid_t *uid, kgid_t *gid); }; struct kobj_uevent_env { diff --git a/lib/kobject.c b/lib/kobject.c index 18989b5b3b56..f2dc1f756007 100644 --- a/lib/kobject.c +++ b/lib/kobject.c @@ -35,6 +35,25 @@ const void *kobject_namespace(struct kobject *kobj) return kobj->ktype->namespace(kobj); } +/** + * kobject_get_ownership - get sysfs ownership data for @kobj + * @kobj: kobject in question + * @uid: kernel user ID for sysfs objects + * @gid: kernel group ID for sysfs objects + * + * Returns initial uid/gid pair that should be used when creating sysfs + * representation of given kobject. Normally used to adjust ownership of + * objects in a container. + */ +void kobject_get_ownership(struct kobject *kobj, kuid_t *uid, kgid_t *gid) +{ + *uid = GLOBAL_ROOT_UID; + *gid = GLOBAL_ROOT_GID; + + if (kobj->ktype->get_ownership) + kobj->ktype->get_ownership(kobj, uid, gid); +} + /* * populate_dir - populate directory with attributes. * @kobj: object we're working on. -- 2.7.4
Tyler Hicks
2018-Jul-20 21:56 UTC
[Bridge] [PATCH net-next v3 3/8] kobject: kset_create_and_add() - fetch ownership info from parent
From: Dmitry Torokhov <dmitry.torokhov at gmail.com> This change implements get_ownership() for ksets created with kset_create_and_add() call by fetching ownership data from parent kobject. This is done mostly for benefit of "queues" attribute of net devices so that corresponding directory belongs to container's root instead of global root for network devices in a container. Signed-off-by: Dmitry Torokhov <dmitry.torokhov at gmail.com> Reviewed-by: Tyler Hicks <tyhicks at canonical.com> --- lib/kobject.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/lib/kobject.c b/lib/kobject.c index f2dc1f756007..389829d3a1d1 100644 --- a/lib/kobject.c +++ b/lib/kobject.c @@ -887,9 +887,16 @@ static void kset_release(struct kobject *kobj) kfree(kset); } +void kset_get_ownership(struct kobject *kobj, kuid_t *uid, kgid_t *gid) +{ + if (kobj->parent) + kobject_get_ownership(kobj->parent, uid, gid); +} + static struct kobj_type kset_ktype = { .sysfs_ops = &kobj_sysfs_ops, - .release = kset_release, + .release = kset_release, + .get_ownership = kset_get_ownership, }; /** -- 2.7.4
Tyler Hicks
2018-Jul-20 21:56 UTC
[Bridge] [PATCH net-next v3 4/8] driver core: set up ownership of class devices in sysfs
From: Dmitry Torokhov <dmitry.torokhov at gmail.com> Plumb in get_ownership() callback for devices belonging to a class so that they can be created with uid/gid different from global root. This will allow network devices in a container to belong to container's root and not global root. Signed-off-by: Dmitry Torokhov <dmitry.torokhov at gmail.com> Reviewed-by: Tyler Hicks <tyhicks at canonical.com> --- drivers/base/core.c | 9 +++++++++ include/linux/device.h | 5 +++++ 2 files changed, 14 insertions(+) diff --git a/drivers/base/core.c b/drivers/base/core.c index df3e1a44707a..276c7e3f754c 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -866,10 +866,19 @@ static const void *device_namespace(struct kobject *kobj) return ns; } +static void device_get_ownership(struct kobject *kobj, kuid_t *uid, kgid_t *gid) +{ + struct device *dev = kobj_to_dev(kobj); + + if (dev->class && dev->class->get_ownership) + dev->class->get_ownership(dev, uid, gid); +} + static struct kobj_type device_ktype = { .release = device_release, .sysfs_ops = &dev_sysfs_ops, .namespace = device_namespace, + .get_ownership = device_get_ownership, }; diff --git a/include/linux/device.h b/include/linux/device.h index 055a69dbcd18..fe6ccb6dc119 100644 --- a/include/linux/device.h +++ b/include/linux/device.h @@ -384,6 +384,9 @@ int subsys_virtual_register(struct bus_type *subsys, * @shutdown_pre: Called at shut-down time before driver shutdown. * @ns_type: Callbacks so sysfs can detemine namespaces. * @namespace: Namespace of the device belongs to this class. + * @get_ownership: Allows class to specify uid/gid of the sysfs directories + * for the devices belonging to the class. Usually tied to + * device's namespace. * @pm: The default device power management operations of this class. * @p: The private data of the driver core, no one other than the * driver core can touch this. @@ -413,6 +416,8 @@ struct class { const struct kobj_ns_type_operations *ns_type; const void *(*namespace)(struct device *dev); + void (*get_ownership)(struct device *dev, kuid_t *uid, kgid_t *gid); + const struct dev_pm_ops *pm; struct subsys_private *p; -- 2.7.4
Tyler Hicks
2018-Jul-20 21:56 UTC
[Bridge] [PATCH net-next v3 5/8] net-sysfs: require net admin in the init ns for setting tx_maxrate
An upcoming change will allow container root to open some /sys/class/net files for writing. The tx_maxrate attribute can result in changes to actual hardware devices so err on the side of caution by requiring CAP_NET_ADMIN in the init namespace in the corresponding attribute store operation. Signed-off-by: Tyler Hicks <tyhicks at canonical.com> --- net/core/net-sysfs.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index ffa1d18f2c2c..405c41ecb20b 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -1087,6 +1087,9 @@ static ssize_t tx_maxrate_store(struct netdev_queue *queue, int err, index = get_netdev_queue_index(queue); u32 rate = 0; + if (!capable(CAP_NET_ADMIN)) + return -EPERM; + err = kstrtou32(buf, 10, &rate); if (err < 0) return err; -- 2.7.4
Tyler Hicks
2018-Jul-20 21:56 UTC
[Bridge] [PATCH net-next v3 6/8] net-sysfs: make sure objects belong to container's owner
From: Dmitry Torokhov <dmitry.torokhov at gmail.com> When creating various objects in /sys/class/net/... make sure that they belong to container's owner instead of global root (if they belong to a container/namespace). Co-Developed-by: Tyler Hicks <tyhicks at canonical.com> Signed-off-by: Dmitry Torokhov <dmitry.torokhov at gmail.com> Signed-off-by: Tyler Hicks <tyhicks at canonical.com> --- net/core/net-sysfs.c | 47 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 46 insertions(+), 1 deletion(-) diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index 405c41ecb20b..ada065fc685e 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -656,6 +656,24 @@ static const struct attribute_group wireless_group = { #define net_class_groups NULL #endif /* CONFIG_SYSFS */ +static void net_ns_get_ownership(const struct net *net, + kuid_t *uid, kgid_t *gid) +{ + if (net) { + kuid_t ns_root_uid = make_kuid(net->user_ns, 0); + kgid_t ns_root_gid = make_kgid(net->user_ns, 0); + + if (uid_valid(ns_root_uid)) + *uid = ns_root_uid; + + if (gid_valid(ns_root_gid)) + *gid = ns_root_gid; + } else { + *uid = GLOBAL_ROOT_UID; + *gid = GLOBAL_ROOT_GID; + } +} + #ifdef CONFIG_SYSFS #define to_rx_queue_attr(_attr) \ container_of(_attr, struct rx_queue_attribute, attr) @@ -905,11 +923,20 @@ static const void *rx_queue_namespace(struct kobject *kobj) return ns; } +static void rx_queue_get_ownership(struct kobject *kobj, + kuid_t *uid, kgid_t *gid) +{ + const struct net *net = rx_queue_namespace(kobj); + + net_ns_get_ownership(net, uid, gid); +} + static struct kobj_type rx_queue_ktype __ro_after_init = { .sysfs_ops = &rx_queue_sysfs_ops, .release = rx_queue_release, .default_attrs = rx_queue_default_attrs, - .namespace = rx_queue_namespace + .namespace = rx_queue_namespace, + .get_ownership = rx_queue_get_ownership, }; static int rx_queue_add_kobject(struct net_device *dev, int index) @@ -1431,11 +1458,20 @@ static const void *netdev_queue_namespace(struct kobject *kobj) return ns; } +static void netdev_queue_get_ownership(struct kobject *kobj, + kuid_t *uid, kgid_t *gid) +{ + const struct net *net = netdev_queue_namespace(kobj); + + net_ns_get_ownership(net, uid, gid); +} + static struct kobj_type netdev_queue_ktype __ro_after_init = { .sysfs_ops = &netdev_queue_sysfs_ops, .release = netdev_queue_release, .default_attrs = netdev_queue_default_attrs, .namespace = netdev_queue_namespace, + .get_ownership = netdev_queue_get_ownership, }; static int netdev_queue_add_kobject(struct net_device *dev, int index) @@ -1625,6 +1661,14 @@ static const void *net_namespace(struct device *d) return dev_net(dev); } +static void net_get_ownership(struct device *d, kuid_t *uid, kgid_t *gid) +{ + struct net_device *dev = to_net_dev(d); + const struct net *net = dev_net(dev); + + net_ns_get_ownership(net, uid, gid); +} + static struct class net_class __ro_after_init = { .name = "net", .dev_release = netdev_release, @@ -1632,6 +1676,7 @@ static struct class net_class __ro_after_init = { .dev_uevent = netdev_uevent, .ns_type = &net_ns_type_operations, .namespace = net_namespace, + .get_ownership = net_get_ownership, }; #ifdef CONFIG_OF_NET -- 2.7.4
Tyler Hicks
2018-Jul-20 21:56 UTC
[Bridge] [PATCH net-next v3 7/8] net: create reusable function for getting ownership info of sysfs inodes
Make net_ns_get_ownership() reusable by networking code outside of core. This is useful, for example, to allow bridge related sysfs files to be owned by container root. Add a function comment since this is a potentially dangerous function to use given the way that kobject_get_ownership() works by initializing uid and gid before calling .get_ownership(). Signed-off-by: Tyler Hicks <tyhicks at canonical.com> --- include/net/net_namespace.h | 10 ++++++++++ net/core/net-sysfs.c | 18 ------------------ net/core/net_namespace.c | 28 ++++++++++++++++++++++++++++ 3 files changed, 38 insertions(+), 18 deletions(-) diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h index a71264d75d7f..9b5fdc50519a 100644 --- a/include/net/net_namespace.h +++ b/include/net/net_namespace.h @@ -10,6 +10,7 @@ #include <linux/workqueue.h> #include <linux/list.h> #include <linux/sysctl.h> +#include <linux/uidgid.h> #include <net/flow.h> #include <net/netns/core.h> @@ -170,6 +171,8 @@ extern struct net init_net; struct net *copy_net_ns(unsigned long flags, struct user_namespace *user_ns, struct net *old_net); +void net_ns_get_ownership(const struct net *net, kuid_t *uid, kgid_t *gid); + void net_ns_barrier(void); #else /* CONFIG_NET_NS */ #include <linux/sched.h> @@ -182,6 +185,13 @@ static inline struct net *copy_net_ns(unsigned long flags, return old_net; } +static inline void net_ns_get_ownership(const struct net *net, + kuid_t *uid, kgid_t *gid) +{ + *uid = GLOBAL_ROOT_UID; + *gid = GLOBAL_ROOT_GID; +} + static inline void net_ns_barrier(void) {} #endif /* CONFIG_NET_NS */ diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c index ada065fc685e..0a95bcf64cdc 100644 --- a/net/core/net-sysfs.c +++ b/net/core/net-sysfs.c @@ -656,24 +656,6 @@ static const struct attribute_group wireless_group = { #define net_class_groups NULL #endif /* CONFIG_SYSFS */ -static void net_ns_get_ownership(const struct net *net, - kuid_t *uid, kgid_t *gid) -{ - if (net) { - kuid_t ns_root_uid = make_kuid(net->user_ns, 0); - kgid_t ns_root_gid = make_kgid(net->user_ns, 0); - - if (uid_valid(ns_root_uid)) - *uid = ns_root_uid; - - if (gid_valid(ns_root_gid)) - *gid = ns_root_gid; - } else { - *uid = GLOBAL_ROOT_UID; - *gid = GLOBAL_ROOT_GID; - } -} - #ifdef CONFIG_SYSFS #define to_rx_queue_attr(_attr) \ container_of(_attr, struct rx_queue_attribute, attr) diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c index a11e03f920d3..738871af5efa 100644 --- a/net/core/net_namespace.c +++ b/net/core/net_namespace.c @@ -17,6 +17,7 @@ #include <linux/user_namespace.h> #include <linux/net_namespace.h> #include <linux/sched/task.h> +#include <linux/uidgid.h> #include <net/sock.h> #include <net/netlink.h> @@ -448,6 +449,33 @@ struct net *copy_net_ns(unsigned long flags, return net; } +/** + * net_ns_get_ownership - get sysfs ownership data for @net + * @net: network namespace in question (can be NULL) + * @uid: kernel user ID for sysfs objects + * @gid: kernel group ID for sysfs objects + * + * Returns the uid/gid pair of root in the user namespace associated with the + * given network namespace. + */ +void net_ns_get_ownership(const struct net *net, kuid_t *uid, kgid_t *gid) +{ + if (net) { + kuid_t ns_root_uid = make_kuid(net->user_ns, 0); + kgid_t ns_root_gid = make_kgid(net->user_ns, 0); + + if (uid_valid(ns_root_uid)) + *uid = ns_root_uid; + + if (gid_valid(ns_root_gid)) + *gid = ns_root_gid; + } else { + *uid = GLOBAL_ROOT_UID; + *gid = GLOBAL_ROOT_GID; + } +} +EXPORT_SYMBOL_GPL(net_ns_get_ownership); + static void unhash_nsid(struct net *net, struct net *last) { struct net *tmp; -- 2.7.4
Tyler Hicks
2018-Jul-20 21:56 UTC
[Bridge] [PATCH net-next v3 8/8] bridge: make sure objects belong to container's owner
When creating various bridge objects in /sys/class/net/... make sure that they belong to the container's owner instead of global root (if they belong to a container/namespace). Signed-off-by: Tyler Hicks <tyhicks at canonical.com> --- net/bridge/br_if.c | 9 +++++++++ net/bridge/br_private.h | 2 ++ net/bridge/br_sysfs_if.c | 5 ++--- 3 files changed, 13 insertions(+), 3 deletions(-) diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c index 05e42d86882d..e7c8d55212aa 100644 --- a/net/bridge/br_if.c +++ b/net/bridge/br_if.c @@ -26,6 +26,7 @@ #include <net/sock.h> #include <linux/if_vlan.h> #include <net/switchdev.h> +#include <net/net_namespace.h> #include "br_private.h" @@ -204,11 +205,19 @@ static void release_nbp(struct kobject *kobj) kfree(p); } +static void brport_get_ownership(struct kobject *kobj, kuid_t *uid, kgid_t *gid) +{ + struct net_bridge_port *p = kobj_to_brport(kobj); + + net_ns_get_ownership(dev_net(p->dev), uid, gid); +} + static struct kobj_type brport_ktype = { #ifdef CONFIG_SYSFS .sysfs_ops = &brport_sysfs_ops, #endif .release = release_nbp, + .get_ownership = brport_get_ownership, }; static void destroy_nbp(struct net_bridge_port *p) diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 5216a524b537..cf0005d2a4d0 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -283,6 +283,8 @@ struct net_bridge_port { u16 group_fwd_mask; }; +#define kobj_to_brport(obj) container_of(obj, struct net_bridge_port, kobj) + #define br_auto_port(p) ((p)->flags & BR_AUTO_MASK) #define br_promisc_port(p) ((p)->flags & BR_PROMISC) diff --git a/net/bridge/br_sysfs_if.c b/net/bridge/br_sysfs_if.c index f99c5bf5c906..ab4c7f8adf68 100644 --- a/net/bridge/br_sysfs_if.c +++ b/net/bridge/br_sysfs_if.c @@ -249,13 +249,12 @@ static const struct brport_attribute *brport_attrs[] = { }; #define to_brport_attr(_at) container_of(_at, struct brport_attribute, attr) -#define to_brport(obj) container_of(obj, struct net_bridge_port, kobj) static ssize_t brport_show(struct kobject *kobj, struct attribute *attr, char *buf) { struct brport_attribute *brport_attr = to_brport_attr(attr); - struct net_bridge_port *p = to_brport(kobj); + struct net_bridge_port *p = kobj_to_brport(kobj); if (!brport_attr->show) return -EINVAL; @@ -268,7 +267,7 @@ static ssize_t brport_store(struct kobject *kobj, const char *buf, size_t count) { struct brport_attribute *brport_attr = to_brport_attr(attr); - struct net_bridge_port *p = to_brport(kobj); + struct net_bridge_port *p = kobj_to_brport(kobj); ssize_t ret = -EINVAL; char *endp; unsigned long val; -- 2.7.4
David Miller
2018-Jul-21 06:45 UTC
[Bridge] [PATCH net-next v3 0/8] Make /sys/class/net per net namespace objects belong to container
From: Tyler Hicks <tyhicks at canonical.com> Date: Fri, 20 Jul 2018 21:56:46 +0000> This is a revival of an older patch set from Dmitry Torokhov: > > https://lore.kernel.org/lkml/1471386795-32918-1-git-send-email-dmitry.torokhov at gmail.com/ > > My submission of v2 is here: > > https://lore.kernel.org/lkml/1531497949-1766-1-git-send-email-tyhicks at canonical.com/ > > Here's Dmitry's description:...> * Changes since v2:... Series applied, let's see how the build goes this time :-)