Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 00/14] sysctl: Add a size argument to register functions in sysctl
Why?
This is a preparation patch set that will make it easier for us to apply
subsequent patches that will remove the sentinel element (last empty
element) in the ctl_table arrays.

In itself, it does not remove any sentinels, but it is needed to bring all
the advantages of the removal to fruition, which is to help reduce the
overall build time size of the kernel and run time memory bloat by ~64
bytes per sentinel. Without this patch set we would have to put everything
into one big commit, making the review process that much longer and harder
for everyone.

Since it is so related to the removal of the sentinel element, it's
worthwhile to give a bit of context on this:
* Good summary from Luis about why we want to remove the sentinels.
  https://lore.kernel.org/all/ZMFizKFkVxUFtSqa@bombadil.infradead.org/
* This is a patch set that replaces register_sysctl_table with
  register_sysctl
  https://lore.kernel.org/all/20230302204612.782387-1-mcgrof@kernel.org/
* Patch set to deprecate register_sysctl_paths()
  https://lore.kernel.org/all/20230302202826.776286-1-mcgrof@kernel.org/
* Here there is an explicit expectation for the removal of the sentinel
  element.
  https://lore.kernel.org/all/20230321130908.6972-1-frank.li@vivo.com
* The "ARRAY_SIZE" approach was mentioned (proposed?) in this thread
  https://lore.kernel.org/all/20220220060626.15885-1-tangmeng@uniontech.com

What?
These commits set things up so we can start removing the sentinel elements.
They modify sysctl and net_sysctl internals so that registering a ctl_table
that contains a sentinel gives the same result as passing a table_size
calculated from the ctl_table array without a sentinel. We accomplish this
by introducing a table_size argument in the same place where procname is
checked for NULL. The idea is for it to keep stopping when it hits
->procname == NULL, while the sentinel is still present. And when the
sentinel is removed, it will stop on the table_size (thx to
jani.nikula at linux.intel.com for the discussion that led to this). This
allows us to remove sentinels from one (or several) files at a time.

These commits are part of a bigger set containing the removal of ctl_table
sentinel
(https://github.com/Joelgranados/linux/tree/tag/sysctl_remove_empty_elem_V2).
The idea is to make the review process easier by chunking the 65+ commits
into manageable pieces. My idea is to send out one chunk at a time so it
can be reviewed separately from the others without the noise from parallel
related sets. After this first chunk will come 6 that remove the sentinel
element from arch/*, drivers/*, fs/*, kernel/*, net/* and miscellaneous.
And then a final one that removes the ->procname == NULL check. You can see
all commits here
(https://github.com/Joelgranados/linux/tree/tag/sysctl_remove_empty_elem_V2).

Commits in this chunk:
* Preparation commits:
    start : sysctl: Prefer ctl_table_header in proc_sysctl
    end   : sysctl: Add size argument to init_header
  These are preparation commits that make sure that we have the
  ctl_table_header where we need the array size.

* Add size to __register_sysctl_table, __register_sysctl_init and
  register_sysctl
    start : sysctl: Add a size arg to __register_sysctl_table
    end   : sysctl: Add size arg to __register_sysctl_init
  Here we replace the existing register functions with macros that add the
  ARRAY_SIZE automatically. Unfortunately these macros cannot be used for
  the register calls that pass a pointer; in this situation we add register
  functions with a table_size argument (thx to greg at kroah.com for
  bringing this to my attention)

* Add size to register_net_sysctl
    start : sysctl: Add size to register_net_sysctl function
    end   : sysctl: SIZE_MAX->ARRAY_SIZE in register_net_sysctl
  register_net_sysctl is an indirection function to the sysctl
  registrations and needed several commits to add table_size to all its
  callers. We temporarily use SIZE_MAX to avoid compiler warnings while we
  change register_net_sysctl to register_net_sysctl_sz; we remove it with
  the penultimate patch of this set. Finally, we make sure to adjust the
  calculated size every time there is a check for unprivileged users.

* Add size as additional stopping criteria
    commit : sysctl: Use ctl_table_size as stopping criteria for list macro
  We add table_size check in the main macro within proc_sysctl.c. This
  commit allows the removal of the sentinel element by chunks.

Testing:
* Ran sysctl selftests (./tools/testing/selftests/sysctl/sysctl.sh)
* Successfully ran this through 0-day

Size saving estimates:
A consequence of eventually removing all the sentinels (64 bytes per
sentinel) is the bytes we save. These are *not* numbers that we will get
after this patch set; these are the numbers that we will get after removing
all the sentinels. I included them here because they are relevant and to
get an idea of just how much memory we are talking about.
* bloat-o-meter:
  The "yesall" configuration results save 9158 bytes (you can see the
  output here
  https://lore.kernel.org/all/20230621091000.424843-1-j.granados@samsung.com/).
  The "tiny" configuration + CONFIG_SYSCTL save 1215 bytes (you can see the
  output here [2])
* memory usage:
  As we no longer need the sentinel element within proc_sysctl.c, we save
  some bytes in main memory as well. In my testing kernel I measured a
  difference of 6720 bytes. I include the way to measure this in [1]

Comments/feedback greatly appreciated

V2:
* Dropped moving mpls_table up the af_mpls.c file. We don't need it any
  longer as it is not really used before its current location.
* Added/Clarified the why in several commit messages that were missing it.
* Clarified the why in the cover letter to be "to make it easier to apply
  subsequent patches that will remove the sentinels"
* Added documentation for table_size
* Added suggested by tags (Greg and Jani) to relevant commits

Best
Joel

[1]
To measure the in memory savings apply this patch on top of
https://github.com/Joelgranados/linux/tree/tag/sysctl_remove_empty_elem_V1
"
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 5f413bfd6271..9aa8374c0ef1 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -975,6 +975,7 @@ static struct ctl_dir *new_dir(struct ctl_table_set *set,
 	table[0].procname = new_name;
 	table[0].mode = S_IFDIR|S_IRUGO|S_IXUGO;
 	init_header(&new->header, set->dir.header.root, set, node, table, 1);
+	printk("%ld sysctl saved mem kzalloc \n", sizeof(struct ctl_table));

 	return new;
 }
@@ -1202,6 +1203,7 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table_
 		    head->ctl_table_size);
 	links->nreg = head->ctl_table_size;
+	printk("%ld sysctl saved mem kzalloc \n", sizeof(struct ctl_table));

 	return links;
 }
"
and then run the following bash script in the kernel:

accum=0
for n in $(dmesg | grep kzalloc | awk '{print $3}') ; do
    echo $n
    accum=$(calc "$accum + $n")
done
echo $accum

[2]
bloat-o-meter with "tiny" config:
add/remove: 0/2 grow/shrink: 33/24 up/down: 470/-1685 (-1215)
Function                                     old     new   delta
insert_header                                831     966    +135
__register_sysctl_table                      971    1092    +121
get_links                                    177     226     +49
put_links                                    167     186     +19
erase_header                                  55      66     +11
sysctl_init_bases                             59      69     +10
setup_sysctl_set                              65      73      +8
utsname_sysctl_init                           26      31      +5
sld_mitigate_sysctl_init                      33      38      +5
setup_userns_sysctls                         158     163      +5
sched_rt_sysctl_init                          33      38      +5
sched_fair_sysctl_init                        33      38      +5
sched_dl_sysctl_init                          33      38      +5
random_sysctls_init                           33      38      +5
page_writeback_init                          122     127      +5
oom_init                                      73      78      +5
kernel_panic_sysctls_init                     33      38      +5
kernel_exit_sysctls_init                      33      38      +5
init_umh_sysctls                              33      38      +5
init_signal_sysctls                           33      38      +5
init_pipe_fs                                  94      99      +5
init_fs_sysctls                               33      38      +5
init_fs_stat_sysctls                          33      38      +5
init_fs_namespace_sysctls                     33      38      +5
init_fs_namei_sysctls                         33      38      +5
init_fs_inode_sysctls                         33      38      +5
init_fs_exec_sysctls                          33      38      +5
init_fs_dcache_sysctls                        33      38      +5
register_sysctl                               22      25      +3
__register_sysctl_init                         9      12      +3
user_namespace_sysctl_init                   149     151      +2
sched_core_sysctl_init                        38      40      +2
register_sysctl_mount_point                   13      15      +2
vm_table                                    1344    1280     -64
vm_page_writeback_sysctls                    512     448     -64
vm_oom_kill_table                            256     192     -64
uts_kern_table                               448     384     -64
usermodehelper_table                         192     128     -64
user_table                                   576     512     -64
sld_sysctls                                  128      64     -64
signal_debug_table                           128      64     -64
sched_rt_sysctls                             256     192     -64
sched_fair_sysctls                           128      64     -64
sched_dl_sysctls                             192     128     -64
sched_core_sysctls                            64       -     -64
root_table                                   128      64     -64
random_table                                 448     384     -64
namei_sysctls                                320     256     -64
kern_table                                  1792    1728     -64
kern_panic_table                             128      64     -64
kern_exit_table                              128      64     -64
inodes_sysctls                               192     128     -64
fs_stat_sysctls                              256     192     -64
fs_shared_sysctls                            192     128     -64
fs_pipe_sysctls                              256     192     -64
fs_namespace_sysctls                         128      64     -64
fs_exec_sysctls                              128      64     -64
fs_dcache_sysctls                            128      64     -64
init_header                                   85       -     -85
Total: Before=1877669, After=1876454, chg -0.06%

base: fdf0eaf11452

Joel Granados (14):
  sysctl: Prefer ctl_table_header in proc_sysctl
  sysctl: Use ctl_table_header in list_for_each_table_entry
  sysctl: Add ctl_table_size to ctl_table_header
  sysctl: Add size argument to init_header
  sysctl: Add a size arg to __register_sysctl_table
  sysctl: Add size to register_sysctl
  sysctl: Add size arg to __register_sysctl_init
  sysctl: Add size to register_net_sysctl function
  ax.25: Update to register_net_sysctl_sz
  netfilter: Update to register_net_sysctl_sz
  networking: Update to register_net_sysctl_sz
  vrf: Update to register_net_sysctl_sz
  sysctl: SIZE_MAX->ARRAY_SIZE in register_net_sysctl
  sysctl: Use ctl_table_size as stopping criteria for list macro

 arch/arm64/kernel/armv8_deprecated.c    |  2 +-
 arch/s390/appldata/appldata_base.c      |  2 +-
 drivers/net/vrf.c                       |  3 +-
 fs/proc/proc_sysctl.c                   | 90 +++++++++++++------------
 include/linux/sysctl.h                  | 31 +++++++--
 include/net/ipv6.h                      |  2 +
 include/net/net_namespace.h             | 10 +--
 ipc/ipc_sysctl.c                        |  4 +-
 ipc/mq_sysctl.c                         |  4 +-
 kernel/ucount.c                         |  5 +-
 net/ax25/sysctl_net_ax25.c              |  3 +-
 net/bridge/br_netfilter_hooks.c         |  3 +-
 net/core/neighbour.c                    |  8 ++-
 net/core/sysctl_net_core.c              |  3 +-
 net/ieee802154/6lowpan/reassembly.c     |  8 ++-
 net/ipv4/devinet.c                      |  3 +-
 net/ipv4/ip_fragment.c                  |  3 +-
 net/ipv4/route.c                        |  8 ++-
 net/ipv4/sysctl_net_ipv4.c              |  3 +-
 net/ipv4/xfrm4_policy.c                 |  3 +-
 net/ipv6/addrconf.c                     |  3 +-
 net/ipv6/icmp.c                         |  5 ++
 net/ipv6/netfilter/nf_conntrack_reasm.c |  3 +-
 net/ipv6/reassembly.c                   |  3 +-
 net/ipv6/route.c                        | 13 ++--
 net/ipv6/sysctl_net_ipv6.c              | 16 +++--
 net/ipv6/xfrm6_policy.c                 |  3 +-
 net/mpls/af_mpls.c                      |  6 +-
 net/mptcp/ctrl.c                        |  3 +-
 net/netfilter/ipvs/ip_vs_ctl.c          |  8 ++-
 net/netfilter/ipvs/ip_vs_lblc.c         | 10 ++-
 net/netfilter/ipvs/ip_vs_lblcr.c        | 10 ++-
 net/netfilter/nf_conntrack_standalone.c |  4 +-
 net/netfilter/nf_log.c                  |  7 +-
 net/rds/tcp.c                           |  3 +-
 net/sctp/sysctl.c                       |  4 +-
 net/smc/smc_sysctl.c                    |  3 +-
 net/sysctl_net.c                        | 26 ++++---
 net/unix/sysctl_net_unix.c              |  3 +-
 net/xfrm/xfrm_sysctl.c                  |  8 ++-
 40 files changed, 222 insertions(+), 117 deletions(-)

--
2.30.2
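The stopping rule described in the "What?" section of the cover letter can
be sketched in user-space C. This is only an illustration under simplified
assumptions, not code from the series: the two structs are reduced
stand-ins for the kernel ones, and count_entries(), count_with_sentinel()
and count_without_sentinel() are hypothetical helpers. The point it tries
to show is that a table that still has its sentinel and a table that has
already lost it are walked over the same entries once the size is used as
a second stopping criterion.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Reduced stand-ins for the kernel structures; only the fields used here. */
struct ctl_table {
	const char *procname;
};

struct ctl_table_header {
	struct ctl_table *ctl_table;
	size_t ctl_table_size;
};

/*
 * Hypothetical helper (not from the patch set): walk the table and stop at
 * ctl_table_size or at the first NULL procname, whichever comes first.
 */
static size_t count_entries(const struct ctl_table_header *head)
{
	size_t i;

	for (i = 0; i < head->ctl_table_size; i++)
		if (!head->ctl_table[i].procname)
			break;
	return i;
}

static struct ctl_table with_sentinel[] = {
	{ .procname = "foo" },
	{ .procname = "bar" },
	{ .procname = NULL },	/* sentinel, still present */
};

static struct ctl_table without_sentinel[] = {
	{ .procname = "foo" },
	{ .procname = "bar" },
};

size_t count_with_sentinel(void)
{
	struct ctl_table_header head = {
		.ctl_table = with_sentinel,
		/* ARRAY_SIZE counts the sentinel too, so this is 3 */
		.ctl_table_size = ARRAY_SIZE(with_sentinel),
	};

	return count_entries(&head);
}

size_t count_without_sentinel(void)
{
	struct ctl_table_header head = {
		.ctl_table = without_sentinel,
		.ctl_table_size = ARRAY_SIZE(without_sentinel),
	};

	return count_entries(&head);
}
```

In this sketch the size bound is what stops the walk over
without_sentinel, while the NULL procname is what stops the walk over
with_sentinel, which is why files can be converted one at a time.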
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 01/14] sysctl: Prefer ctl_table_header in proc_sysctl
This is a preparation commit that replaces ctl_table with ctl_table_header
as the pointer that is passed around in proc_sysctl.c. This will become
necessary in subsequent commits when the size of the ctl_table array can no
longer be calculated by searching for an empty sentinel (last empty
ctl_table element) but will be carried along inside the ctl_table_header
struct.

Signed-off-by: Joel Granados <j.granados at samsung.com>
---
 fs/proc/proc_sysctl.c | 36 ++++++++++++++++++------------------
 1 file changed, 18 insertions(+), 18 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 5ea42653126e..94d71446da39 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1125,11 +1125,11 @@ static int sysctl_check_table_array(const char *path, struct ctl_table *table)
 	return err;
 }

-static int sysctl_check_table(const char *path, struct ctl_table *table)
+static int sysctl_check_table(const char *path, struct ctl_table_header *header)
 {
 	struct ctl_table *entry;
 	int err = 0;
-	list_for_each_table_entry(entry, table) {
+	list_for_each_table_entry(entry, header->ctl_table) {
 		if ((entry->proc_handler == proc_dostring) ||
 		    (entry->proc_handler == proc_dobool) ||
 		    (entry->proc_handler == proc_dointvec) ||
@@ -1159,8 +1159,7 @@ static int sysctl_check_table(const char *path, struct ctl_table *table)
 	return err;
 }

-static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table *table,
-	struct ctl_table_root *link_root)
+static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table_header *head)
 {
 	struct ctl_table *link_table, *entry, *link;
 	struct ctl_table_header *links;
@@ -1170,7 +1169,7 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table

 	name_bytes = 0;
 	nr_entries = 0;
-	list_for_each_table_entry(entry, table) {
+	list_for_each_table_entry(entry, head->ctl_table) {
 		nr_entries++;
 		name_bytes += strlen(entry->procname) + 1;
 	}
@@ -1189,12 +1188,12 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table
 	link_name = (char *)&link_table[nr_entries + 1];
 	link = link_table;

-	list_for_each_table_entry(entry, table) {
+	list_for_each_table_entry(entry, head->ctl_table) {
 		int len = strlen(entry->procname) + 1;
 		memcpy(link_name, entry->procname, len);
 		link->procname = link_name;
 		link->mode = S_IFLNK|S_IRWXUGO;
-		link->data = link_root;
+		link->data = head->root;
 		link_name += len;
 		link++;
 	}
@@ -1205,15 +1204,16 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table
 }

 static bool get_links(struct ctl_dir *dir,
-		      struct ctl_table *table, struct ctl_table_root *link_root)
+		      struct ctl_table_header *header,
+		      struct ctl_table_root *link_root)
 {
-	struct ctl_table_header *head;
+	struct ctl_table_header *tmp_head;
 	struct ctl_table *entry, *link;

 	/* Are there links available for every entry in table? */
-	list_for_each_table_entry(entry, table) {
+	list_for_each_table_entry(entry, header->ctl_table) {
 		const char *procname = entry->procname;
-		link = find_entry(&head, dir, procname, strlen(procname));
+		link = find_entry(&tmp_head, dir, procname, strlen(procname));
 		if (!link)
 			return false;
 		if (S_ISDIR(link->mode) && S_ISDIR(entry->mode))
@@ -1224,10 +1224,10 @@ static bool get_links(struct ctl_dir *dir,
 	}

 	/* The checks passed. Increase the registration count on the links */
-	list_for_each_table_entry(entry, table) {
+	list_for_each_table_entry(entry, header->ctl_table) {
 		const char *procname = entry->procname;
-		link = find_entry(&head, dir, procname, strlen(procname));
-		head->nreg++;
+		link = find_entry(&tmp_head, dir, procname, strlen(procname));
+		tmp_head->nreg++;
 	}
 	return true;
 }
@@ -1246,13 +1246,13 @@ static int insert_links(struct ctl_table_header *head)
 	if (IS_ERR(core_parent))
 		return 0;

-	if (get_links(core_parent, head->ctl_table, head->root))
+	if (get_links(core_parent, head, head->root))
 		return 0;

 	core_parent->header.nreg++;
 	spin_unlock(&sysctl_lock);

-	links = new_links(core_parent, head->ctl_table, head->root);
+	links = new_links(core_parent, head);

 	spin_lock(&sysctl_lock);
 	err = -ENOMEM;
@@ -1260,7 +1260,7 @@ static int insert_links(struct ctl_table_header *head)
 		goto out;

 	err = 0;
-	if (get_links(core_parent, head->ctl_table, head->root)) {
+	if (get_links(core_parent, head, head->root)) {
 		kfree(links);
 		goto out;
 	}
@@ -1371,7 +1371,7 @@ struct ctl_table_header *__register_sysctl_table(

 	node = (struct ctl_node *)(header + 1);
 	init_header(header, root, set, node, table);
-	if (sysctl_check_table(path, table))
+	if (sysctl_check_table(path, header))
 		goto fail;

 	spin_lock(&sysctl_lock);
--
2.30.2
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 02/14] sysctl: Use ctl_table_header in list_for_each_table_entry
We replace the ctl_table with the ctl_table_header pointer in
list_for_each_table_entry which is the macro responsible for traversing the
ctl_table arrays. This is a preparation commit that will make it easier to
add the ctl_table array size (that will be added to ctl_table_header in
subsequent commits) to the already existing loop logic based on empty
ctl_table elements (so called sentinels).

Signed-off-by: Joel Granados <j.granados at samsung.com>
---
 fs/proc/proc_sysctl.c | 26 ++++++++++++++------------
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 94d71446da39..884460b0385b 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -19,8 +19,8 @@
 #include <linux/kmemleak.h>
 #include "internal.h"

-#define list_for_each_table_entry(entry, table) \
-	for ((entry) = (table); (entry)->procname; (entry)++)
+#define list_for_each_table_entry(entry, header) \
+	for ((entry) = (header->ctl_table); (entry)->procname; (entry)++)

 static const struct dentry_operations proc_sys_dentry_operations;
 static const struct file_operations proc_sys_file_operations;
@@ -204,7 +204,7 @@ static void init_header(struct ctl_table_header *head,
 	if (node) {
 		struct ctl_table *entry;

-		list_for_each_table_entry(entry, table) {
+		list_for_each_table_entry(entry, head) {
 			node->header = head;
 			node++;
 		}
@@ -215,7 +215,7 @@ static void erase_header(struct ctl_table_header *head)
 {
 	struct ctl_table *entry;

-	list_for_each_table_entry(entry, head->ctl_table)
+	list_for_each_table_entry(entry, head)
 		erase_entry(head, entry);
 }

@@ -242,7 +242,7 @@ static int insert_header(struct ctl_dir *dir, struct ctl_table_header *header)
 	err = insert_links(header);
 	if (err)
 		goto fail_links;
-	list_for_each_table_entry(entry, header->ctl_table) {
+	list_for_each_table_entry(entry, header) {
 		err = insert_entry(header, entry);
 		if (err)
 			goto fail;
@@ -1129,7 +1129,7 @@ static int sysctl_check_table(const char *path, struct ctl_table_header *header)
 {
 	struct ctl_table *entry;
 	int err = 0;
-	list_for_each_table_entry(entry, header->ctl_table) {
+	list_for_each_table_entry(entry, header) {
 		if ((entry->proc_handler == proc_dostring) ||
 		    (entry->proc_handler == proc_dobool) ||
 		    (entry->proc_handler == proc_dointvec) ||
@@ -1169,7 +1169,7 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table_

 	name_bytes = 0;
 	nr_entries = 0;
-	list_for_each_table_entry(entry, head->ctl_table) {
+	list_for_each_table_entry(entry, head) {
 		nr_entries++;
 		name_bytes += strlen(entry->procname) + 1;
 	}
@@ -1188,7 +1188,7 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table_
 	link_name = (char *)&link_table[nr_entries + 1];
 	link = link_table;

-	list_for_each_table_entry(entry, head->ctl_table) {
+	list_for_each_table_entry(entry, head) {
 		int len = strlen(entry->procname) + 1;
 		memcpy(link_name, entry->procname, len);
 		link->procname = link_name;
@@ -1211,7 +1211,7 @@ static bool get_links(struct ctl_dir *dir,
 	struct ctl_table *entry, *link;

 	/* Are there links available for every entry in table? */
-	list_for_each_table_entry(entry, header->ctl_table) {
+	list_for_each_table_entry(entry, header) {
 		const char *procname = entry->procname;
 		link = find_entry(&tmp_head, dir, procname, strlen(procname));
 		if (!link)
@@ -1224,7 +1224,7 @@ static bool get_links(struct ctl_dir *dir,
 	}

 	/* The checks passed. Increase the registration count on the links */
-	list_for_each_table_entry(entry, header->ctl_table) {
+	list_for_each_table_entry(entry, header) {
 		const char *procname = entry->procname;
 		link = find_entry(&tmp_head, dir, procname, strlen(procname));
 		tmp_head->nreg++;
@@ -1356,12 +1356,14 @@ struct ctl_table_header *__register_sysctl_table(
 {
 	struct ctl_table_root *root = set->dir.header.root;
 	struct ctl_table_header *header;
+	struct ctl_table_header h_tmp;
 	struct ctl_dir *dir;
 	struct ctl_table *entry;
 	struct ctl_node *node;
 	int nr_entries = 0;

-	list_for_each_table_entry(entry, table)
+	h_tmp.ctl_table = table;
+	list_for_each_table_entry(entry, (&h_tmp))
 		nr_entries++;

 	header = kzalloc(sizeof(struct ctl_table_header) +
@@ -1471,7 +1473,7 @@ static void put_links(struct ctl_table_header *header)
 	if (IS_ERR(core_parent))
 		return;

-	list_for_each_table_entry(entry, header->ctl_table) {
+	list_for_each_table_entry(entry, header) {
 		struct ctl_table_header *link_head;
 		struct ctl_table *link;
 		const char *name = entry->procname;
--
2.30.2
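The temporary-header trick used in __register_sysctl_table above (fill in
only ctl_table on a stack header, then hand it to the macro) can be tried
outside the kernel with a small sketch. The structs are reduced stand-ins,
and count_table_entries(), example_table and example_count() are
hypothetical names used only for illustration. One deliberate difference
from the patch: the macro below parenthesizes its header argument, so the
caller can pass &h_tmp directly. With the macro as written in the patch,
header->ctl_table would expand to &h_tmp->ctl_table, which is presumably
why the callers there pass (&h_tmp) with explicit parentheses.

```c
#include <stddef.h>

/* Reduced stand-ins for the kernel structures. */
struct ctl_table {
	const char *procname;
};

struct ctl_table_header {
	struct ctl_table *ctl_table;
};

/*
 * Variation on the macro in the patch: (header) is parenthesized inside
 * the macro, so an argument such as &h_tmp expands correctly.
 */
#define list_for_each_table_entry(entry, header) \
	for ((entry) = (header)->ctl_table; (entry)->procname; (entry)++)

/*
 * Count the entries of a sentinel-terminated table by wrapping it in a
 * throwaway header that lives on the stack.
 */
static int count_table_entries(struct ctl_table *table)
{
	struct ctl_table_header h_tmp = { .ctl_table = table };
	struct ctl_table *entry;
	int nr_entries = 0;

	list_for_each_table_entry(entry, &h_tmp)
		nr_entries++;
	return nr_entries;
}

/* Example table: three entries followed by the sentinel. */
static struct ctl_table example_table[] = {
	{ .procname = "a" },
	{ .procname = "b" },
	{ .procname = "c" },
	{ .procname = NULL },
};

int example_count(void)
{
	return count_table_entries(example_table);
}
```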
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 03/14] sysctl: Add ctl_table_size to ctl_table_header
The new ctl_table_size element will hold the size of the ctl_table arrays
contained in the ctl_table_header. This value should eventually be passed
by the callers to the sysctl register infrastructure. And while this commit
introduces the variable, it does not set nor use it because that requires
case by case considerations for each caller.

It provides two important things: (1) A place to put the result of the
ctl_table array calculation when it gets introduced for each caller. And
(2) the size that will be used as the additional stopping criteria in the
list_for_each_table_entry macro (to be added when all the callers are
migrated)

Signed-off-by: Joel Granados <j.granados at samsung.com>
---
 include/linux/sysctl.h | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 59d451f455bf..33252ad58ebe 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -159,12 +159,22 @@ struct ctl_node {
 	struct ctl_table_header *header;
 };

-/* struct ctl_table_header is used to maintain dynamic lists of
-   struct ctl_table trees. */
+/**
+ * struct ctl_table_header - maintains dynamic lists of struct ctl_table trees
+ * @ctl_table: pointer to the first element in ctl_table array
+ * @ctl_table_size: number of elements pointed by @ctl_table
+ * @used: The entry will never be touched when equal to 0.
+ * @count: Upped every time something is added to @inodes and downed every time
+ *         something is removed from inodes
+ * @nreg: When nreg drops to 0 the ctl_table_header will be unregistered.
+ * @rcu: Delays the freeing of the inode. Introduced with "unfuck proc_sysctl ->d_compare()"
+ *
+ */
 struct ctl_table_header {
 	union {
 		struct {
 			struct ctl_table *ctl_table;
+			int ctl_table_size;
 			int used;
 			int count;
 			int nreg;
--
2.30.2
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 04/14] sysctl: Add size argument to init_header
In this commit, we add a table_size argument to the init_header function in
order to initialize the ctl_table_size variable in ctl_table_header. Even
though the size is not yet used, it is now initialized within the sysctl
subsys. We need this commit for when we start adding the table_size
arguments to the sysctl functions (e.g. register_sysctl,
__register_sysctl_table and __register_sysctl_init).

Note that in __register_sysctl_table we temporarily use a calculated size
until we add the size argument to that function in subsequent commits.

Signed-off-by: Joel Granados <j.granados at samsung.com>
---
 fs/proc/proc_sysctl.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 884460b0385b..fa1438f1a355 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -188,9 +188,10 @@ static void erase_entry(struct ctl_table_header *head, struct ctl_table *entry)

 static void init_header(struct ctl_table_header *head,
 	struct ctl_table_root *root, struct ctl_table_set *set,
-	struct ctl_node *node, struct ctl_table *table)
+	struct ctl_node *node, struct ctl_table *table, size_t table_size)
 {
 	head->ctl_table = table;
+	head->ctl_table_size = table_size;
 	head->ctl_table_arg = table;
 	head->used = 0;
 	head->count = 1;
@@ -973,7 +974,7 @@ static struct ctl_dir *new_dir(struct ctl_table_set *set,
 	memcpy(new_name, name, namelen);
 	table[0].procname = new_name;
 	table[0].mode = S_IFDIR|S_IRUGO|S_IXUGO;
-	init_header(&new->header, set->dir.header.root, set, node, table);
+	init_header(&new->header, set->dir.header.root, set, node, table, 1);

 	return new;
 }
@@ -1197,7 +1198,8 @@ static struct ctl_table_header *new_links(struct ctl_dir *dir, struct ctl_table_
 		link_name += len;
 		link++;
 	}
-	init_header(links, dir->header.root, dir->header.set, node, link_table);
+	init_header(links, dir->header.root, dir->header.set, node, link_table,
+		    head->ctl_table_size);
 	links->nreg = nr_entries;

 	return links;
@@ -1372,7 +1374,7 @@ struct ctl_table_header *__register_sysctl_table(
 		return NULL;

 	node = (struct ctl_node *)(header + 1);
-	init_header(header, root, set, node, table, nr_entries);
+	init_header(header, root, set, node, table, nr_entries);
 	if (sysctl_check_table(path, header))
 		goto fail;

@@ -1537,7 +1539,7 @@ void setup_sysctl_set(struct ctl_table_set *set,
 {
 	memset(set, 0, sizeof(*set));
 	set->is_seen = is_seen;
-	init_header(&set->dir.header, root, set, NULL, root_table);
+	init_header(&set->dir.header, root, set, NULL, root_table, 1);
 }

 void retire_sysctl_set(struct ctl_table_set *set)
--
2.30.2
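The memory layout that init_header is handed in __register_sysctl_table
(one allocation holding the header followed by one ctl_node per table
entry, with node pointing just past the header) can be sketched in
user-space C. This is an illustration under stated assumptions rather than
the kernel code: calloc() stands in for kzalloc(), the structs keep only a
few fields, the node member and the alloc_header() helper are hypothetical
additions for the sketch, and the loop here is bounded by table_size,
whereas at this point in the series the kernel loop is still bounded by
the sentinel.

```c
#include <stddef.h>
#include <stdlib.h>

/* Reduced stand-ins for the kernel structures. */
struct ctl_table {
	const char *procname;
};

struct ctl_table_header;

struct ctl_node {
	struct ctl_table_header *header;
};

struct ctl_table_header {
	struct ctl_table *ctl_table;
	size_t ctl_table_size;
	struct ctl_node *node;	/* kept only so the sketch can be inspected */
};

/* Simplified init_header: record the size and point every node at head. */
static void init_header(struct ctl_table_header *head, struct ctl_node *node,
			struct ctl_table *table, size_t table_size)
{
	size_t i;

	head->ctl_table = table;
	head->ctl_table_size = table_size;
	head->node = node;
	for (i = 0; node && i < table_size; i++)
		node[i].header = head;
}

/*
 * Hypothetical helper: allocate the header and table_size nodes in one
 * block, the way the kzalloc() call in the diff sizes its allocation.
 * The caller releases the result with free().
 */
struct ctl_table_header *alloc_header(struct ctl_table *table,
				      size_t table_size)
{
	struct ctl_table_header *header;

	header = calloc(1, sizeof(*header) +
			   sizeof(struct ctl_node) * table_size);
	if (!header)
		return NULL;

	/* The node array starts right after the header. */
	init_header(header, (struct ctl_node *)(header + 1), table, table_size);
	return header;
}
```

Because the node array is sized from table_size, passing a count that
matches the array keeps the allocation and the initialization loop in
step.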
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 05/14] sysctl: Add a size arg to __register_sysctl_table
We make these changes in order to prepare __register_sysctl_table and its
callers for when we remove the sentinel element (empty element at the end
of ctl_table arrays). We don't actually remove any sentinels in this
commit, but we *do* make sure to use ARRAY_SIZE so the table_size is
available when the removal occurs.

We add a table_size argument to __register_sysctl_table and adjust callers,
all of which pass ctl_table pointers and need an explicit call to
ARRAY_SIZE. We implement a size calculation in register_net_sysctl in order
to forward the size of the array pointer received from the network register
calls.

The new table_size argument does not yet have any effect in the init_header
call which is still dependent on the sentinel's presence. table_size *does*
however drive the `kzalloc` allocation in __register_sysctl_table with no
adverse effects as the allocated memory is either one element greater than
the calculated ctl_table array (for the calls in ipc_sysctl.c, mq_sysctl.c
and ucount.c) or the exact size of the calculated ctl_table array (for the
call from sysctl_net.c and register_sysctl). This approach will allow us to
"just" remove the sentinel without further changes to
__register_sysctl_table as table_size will represent the exact size for all
the callers at that point.

Signed-off-by: Joel Granados <j.granados at samsung.com>
---
 fs/proc/proc_sysctl.c  | 23 ++++++++++++-----------
 include/linux/sysctl.h |  2 +-
 ipc/ipc_sysctl.c       |  4 +++-
 ipc/mq_sysctl.c        |  4 +++-
 kernel/ucount.c        |  3 ++-
 net/sysctl_net.c       |  8 +++++++-
 6 files changed, 28 insertions(+), 16 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index fa1438f1a355..b8dd78e344ff 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1312,6 +1312,7 @@ static struct ctl_dir *sysctl_mkdir_p(struct ctl_dir *dir, const char *path)
  *         should not be free'd after registration. So it should not be
  *         used on stack. It can either be a global or dynamically allocated
  *         by the caller and free'd later after sysctl unregistration.
+ * @table_size : The number of elements in table
  *
  * Register a sysctl table hierarchy. @table should be a filled in ctl_table
  * array. A completely 0 filled entry terminates the table.
@@ -1354,27 +1355,20 @@ static struct ctl_dir *sysctl_mkdir_p(struct ctl_dir *dir, const char *path)
  */
 struct ctl_table_header *__register_sysctl_table(
 	struct ctl_table_set *set,
-	const char *path, struct ctl_table *table)
+	const char *path, struct ctl_table *table, size_t table_size)
 {
 	struct ctl_table_root *root = set->dir.header.root;
 	struct ctl_table_header *header;
-	struct ctl_table_header h_tmp;
 	struct ctl_dir *dir;
-	struct ctl_table *entry;
 	struct ctl_node *node;
-	int nr_entries = 0;
-
-	h_tmp.ctl_table = table;
-	list_for_each_table_entry(entry, (&h_tmp))
-		nr_entries++;

 	header = kzalloc(sizeof(struct ctl_table_header) +
-			 sizeof(struct ctl_node)*nr_entries, GFP_KERNEL_ACCOUNT);
+			 sizeof(struct ctl_node)*table_size, GFP_KERNEL_ACCOUNT);
 	if (!header)
 		return NULL;

 	node = (struct ctl_node *)(header + 1);
-	init_header(header, root, set, node, table, nr_entries);
+	init_header(header, root, set, node, table, table_size);
 	if (sysctl_check_table(path, header))
 		goto fail;

@@ -1423,8 +1417,15 @@ struct ctl_table_header *__register_sysctl_table(
  */
 struct ctl_table_header *register_sysctl(const char *path, struct ctl_table *table)
 {
+	int count = 0;
+	struct ctl_table *entry;
+	struct ctl_table_header t_hdr;
+
+	t_hdr.ctl_table = table;
+	list_for_each_table_entry(entry, (&t_hdr))
+		count++;
 	return __register_sysctl_table(&sysctl_table_root.default_set,
-					path, table);
+					path, table, count);
 }
 EXPORT_SYMBOL(register_sysctl);

diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 33252ad58ebe..0495c858989f 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -226,7 +226,7 @@ extern void retire_sysctl_set(struct ctl_table_set *set);

 struct ctl_table_header *__register_sysctl_table(
 	struct ctl_table_set *set,
-	const char *path, struct ctl_table *table);
+	const char *path, struct ctl_table *table, size_t table_size);
 struct ctl_table_header *register_sysctl(const char *path, struct ctl_table *table);
 void unregister_sysctl_table(struct ctl_table_header * table);

diff --git a/ipc/ipc_sysctl.c b/ipc/ipc_sysctl.c
index ef313ecfb53a..8c62e443f78b 100644
--- a/ipc/ipc_sysctl.c
+++ b/ipc/ipc_sysctl.c
@@ -259,7 +259,9 @@ bool setup_ipc_sysctls(struct ipc_namespace *ns)
 				tbl[i].data = NULL;
 		}

-		ns->ipc_sysctls = __register_sysctl_table(&ns->ipc_set, "kernel", tbl);
+		ns->ipc_sysctls = __register_sysctl_table(&ns->ipc_set,
+							  "kernel", tbl,
+							  ARRAY_SIZE(ipc_sysctls));
 	}
 	if (!ns->ipc_sysctls) {
 		kfree(tbl);
diff --git a/ipc/mq_sysctl.c b/ipc/mq_sysctl.c
index fbf6a8b93a26..ebb5ed81c151 100644
--- a/ipc/mq_sysctl.c
+++ b/ipc/mq_sysctl.c
@@ -109,7 +109,9 @@ bool setup_mq_sysctls(struct ipc_namespace *ns)
 				tbl[i].data = NULL;
 		}

-		ns->mq_sysctls = __register_sysctl_table(&ns->mq_set, "fs/mqueue", tbl);
+		ns->mq_sysctls = __register_sysctl_table(&ns->mq_set,
+							 "fs/mqueue", tbl,
+							 ARRAY_SIZE(mq_sysctls));
 	}
 	if (!ns->mq_sysctls) {
 		kfree(tbl);
diff --git a/kernel/ucount.c b/kernel/ucount.c
index ee8e57fd6f90..2b80264bb79f 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -104,7 +104,8 @@ bool setup_userns_sysctls(struct user_namespace *ns)
 		for (i = 0; i < UCOUNT_COUNTS; i++) {
 			tbl[i].data = &ns->ucount_max[i];
 		}
-		ns->sysctls = __register_sysctl_table(&ns->set, "user", tbl);
+		ns->sysctls = __register_sysctl_table(&ns->set, "user", tbl,
+						      ARRAY_SIZE(user_table));
 	}
 	if (!ns->sysctls) {
 		kfree(tbl);
diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index 4b45ed631eb8..8ee4b74bc009 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -163,10 +163,16 @@ static void ensure_safe_net_sysctl(struct net *net, const char *path,
 struct ctl_table_header *register_net_sysctl(struct net *net,
 	const char *path, struct ctl_table *table)
 {
+	int count = 0;
+	struct ctl_table *entry;
+
 	if (!net_eq(net, &init_net))
 		ensure_safe_net_sysctl(net, path, table);

-	return __register_sysctl_table(&net->sysctls, path, table);
+	for (entry = table; entry->procname; entry++)
+		count++;
+
+	return __register_sysctl_table(&net->sysctls, path, table, count);
 }
 EXPORT_SYMBOL_GPL(register_net_sysctl);

--
2.30.2
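The reason the callers above pass ARRAY_SIZE of the original static table
(ipc_sysctls, mq_sysctls, user_table) instead of the tbl pointer they
register can be shown with a short user-space sketch. It assumes, based on
the kfree(tbl) in the diff context, that tbl is a heap copy of a static
template. Everything in the sketch is hypothetical and only for
illustration: template_table, fake_register_sz() and register_copy() are
made-up names, and malloc() plus memcpy() stand in for the kernel's copy
helper.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Reduced stand-in for the kernel structure. */
struct ctl_table {
	const char *procname;
	void *data;
};

/* Static template, sentinel still present as in this stage of the series. */
static struct ctl_table template_table[] = {
	{ .procname = "first" },
	{ .procname = "second" },
	{ .procname = NULL },
};

/* Hypothetical stand-in for a registration function taking a size. */
static size_t fake_register_sz(struct ctl_table *table, size_t table_size)
{
	(void)table;
	return table_size;
}

size_t register_copy(void)
{
	struct ctl_table *tbl;
	size_t registered;

	/* Heap copy of the template (user-space stand-in for the kernel copy). */
	tbl = malloc(sizeof(template_table));
	if (!tbl)
		return 0;
	memcpy(tbl, template_table, sizeof(template_table));

	/*
	 * tbl is a pointer, so sizeof(tbl) is the size of a pointer and
	 * ARRAY_SIZE(tbl) would not give the element count. The count has
	 * to come from the array the copy was made from.
	 */
	registered = fake_register_sz(tbl, ARRAY_SIZE(template_table));

	free(tbl);
	return registered;
}
```

Note that the count here is 3 because ARRAY_SIZE includes the sentinel,
which matches the commit message's remark that the allocation is one
element larger than needed for these callers until the sentinel is
removed.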
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 06/14] sysctl: Add size to register_sysctl
This commit adds table_size to register_sysctl in preparation for the removal of the sentinel elements in the ctl_table arrays (last empty markers). And though we do *not* remove any sentinels in this commit, we set things up by either passing the table_size explicitly or using ARRAY_SIZE on the ctl_table arrays.

We replace the register_sysctl function with a macro that will add the ARRAY_SIZE to the new register_sysctl_sz function. In this way the callers that are already using an array of ctl_table structs do not change. For the callers that pass a ctl_table array pointer, we pass the table_size to register_sysctl_sz instead of the macro.

Signed-off-by: Joel Granados <j.granados at samsung.com>
Suggested-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
---
 arch/arm64/kernel/armv8_deprecated.c |  2 +-
 arch/s390/appldata/appldata_base.c   |  2 +-
 fs/proc/proc_sysctl.c                | 30 +++++++++++++++-------------
 include/linux/sysctl.h               | 10 ++++++++--
 kernel/ucount.c                      |  2 +-
 net/sysctl_net.c                     |  2 +-
 6 files changed, 28 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/kernel/armv8_deprecated.c b/arch/arm64/kernel/armv8_deprecated.c
index 1febd412b4d2..e459cfd33711 100644
--- a/arch/arm64/kernel/armv8_deprecated.c
+++ b/arch/arm64/kernel/armv8_deprecated.c
@@ -569,7 +569,7 @@ static void __init register_insn_emulation(struct insn_emulation *insn)
 		sysctl->extra2 = &insn->max;
 		sysctl->proc_handler = emulation_proc_handler;
-		register_sysctl("abi", sysctl);
+		register_sysctl_sz("abi", sysctl, 1);
 	}
 }
diff --git a/arch/s390/appldata/appldata_base.c b/arch/s390/appldata/appldata_base.c
index bbefe5e86bdf..3b0994625652 100644
--- a/arch/s390/appldata/appldata_base.c
+++ b/arch/s390/appldata/appldata_base.c
@@ -365,7 +365,7 @@ int appldata_register_ops(struct appldata_ops *ops)
 	ops->ctl_table[0].proc_handler = appldata_generic_handler;
 	ops->ctl_table[0].data = ops;
-	ops->sysctl_header = register_sysctl(appldata_proc_name, ops->ctl_table);
+	ops->sysctl_header = register_sysctl_sz(appldata_proc_name, ops->ctl_table, 1);
 	if (!ops->sysctl_header)
 		goto out;
 	return 0;
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index b8dd78e344ff..80d3e2f61947 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -43,7 +43,7 @@ static struct ctl_table sysctl_mount_point[] = {
  */
 struct ctl_table_header *register_sysctl_mount_point(const char *path)
 {
-	return register_sysctl(path, sysctl_mount_point);
+	return register_sysctl_sz(path, sysctl_mount_point, 0);
 }
 EXPORT_SYMBOL(register_sysctl_mount_point);
@@ -1399,7 +1399,7 @@ struct ctl_table_header *__register_sysctl_table(
 }
 /**
- * register_sysctl - register a sysctl table
+ * register_sysctl_sz - register a sysctl table
  * @path: The path to the directory the sysctl table is in. If the path
  * 	doesn't exist we will create it for you.
  * @table: the table structure. The calller must ensure the life of the @table
@@ -1409,25 +1409,20 @@ struct ctl_table_header *__register_sysctl_table(
  * 	to call unregister_sysctl_table() and can instead use something like
  * 	register_sysctl_init() which does not care for the result of the syctl
  * 	registration.
+ * @table_size: The number of elements in table.
  *
  * Register a sysctl table. @table should be a filled in ctl_table
  * array. A completely 0 filled entry terminates the table.
  *
  * See __register_sysctl_table for more details.
  */
-struct ctl_table_header *register_sysctl(const char *path, struct ctl_table *table)
+struct ctl_table_header *register_sysctl_sz(const char *path, struct ctl_table *table,
+					    size_t table_size)
 {
-	int count = 0;
-	struct ctl_table *entry;
-	struct ctl_table_header t_hdr;
-
-	t_hdr.ctl_table = table;
-	list_for_each_table_entry(entry, (&t_hdr))
-		count++;
 	return __register_sysctl_table(&sysctl_table_root.default_set,
-					path, table, count);
+					path, table, table_size);
 }
-EXPORT_SYMBOL(register_sysctl);
+EXPORT_SYMBOL(register_sysctl_sz);
 /**
  * __register_sysctl_init() - register sysctl table to path
@@ -1452,10 +1447,17 @@ EXPORT_SYMBOL(register_sysctl);
 void __init __register_sysctl_init(const char *path, struct ctl_table *table,
 				 const char *table_name)
 {
-	struct ctl_table_header *hdr = register_sysctl(path, table);
+	int count = 0;
+	struct ctl_table *entry;
+	struct ctl_table_header t_hdr, *hdr;
+
+	t_hdr.ctl_table = table;
+	list_for_each_table_entry(entry, (&t_hdr))
+		count++;
+	hdr = register_sysctl_sz(path, table, count);
 	if (unlikely(!hdr)) {
-		pr_err("failed when register_sysctl %s to %s\n", table_name, path);
+		pr_err("failed when register_sysctl_sz %s to %s\n", table_name, path);
 		return;
 	}
 	kmemleak_not_leak(hdr);
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index 0495c858989f..b1168ae281c9 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -215,6 +215,9 @@ struct ctl_path {
 	const char *procname;
 };
+#define register_sysctl(path, table)	\
+	register_sysctl_sz(path, table, ARRAY_SIZE(table))
+
 #ifdef CONFIG_SYSCTL
 void proc_sys_poll_notify(struct ctl_table_poll *poll);
@@ -227,7 +230,8 @@ extern void retire_sysctl_set(struct ctl_table_set *set);
 struct ctl_table_header *__register_sysctl_table(
 	struct ctl_table_set *set,
 	const char *path, struct ctl_table *table, size_t table_size);
-struct ctl_table_header *register_sysctl(const char *path, struct ctl_table *table);
+struct ctl_table_header *register_sysctl_sz(const char *path, struct ctl_table *table,
+					    size_t table_size);
 void unregister_sysctl_table(struct ctl_table_header * table);
 extern int sysctl_init_bases(void);
@@ -262,7 +266,9 @@ static inline struct ctl_table_header *register_sysctl_mount_point(const char *p
 	return NULL;
 }
-static inline struct ctl_table_header *register_sysctl(const char *path, struct ctl_table *table)
+static inline struct ctl_table_header *register_sysctl_sz(const char *path,
+							  struct ctl_table *table,
+							  size_t table_size)
 {
 	return NULL;
 }
diff --git a/kernel/ucount.c b/kernel/ucount.c
index 2b80264bb79f..4aa6166cb856 100644
--- a/kernel/ucount.c
+++ b/kernel/ucount.c
@@ -365,7 +365,7 @@ static __init int user_namespace_sysctl_init(void)
 	 * default set so that registrations in the child sets work
 	 * properly.
 	 */
-	user_header = register_sysctl("user", empty);
+	user_header = register_sysctl_sz("user", empty, 0);
 	kmemleak_ignore(user_header);
 	BUG_ON(!user_header);
 	BUG_ON(!setup_userns_sysctls(&init_user_ns));
diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index 8ee4b74bc009..d9cbbb51b143 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -101,7 +101,7 @@ __init int net_sysctl_init(void)
 	 * registering "/proc/sys/net" as an empty directory not in a
 	 * network namespace.
 	 */
-	net_header = register_sysctl("net", empty);
+	net_header = register_sysctl_sz("net", empty, 0);
 	if (!net_header)
 		goto out;
 	ret = register_pernet_subsys(&sysctl_pernet_ops);
--
2.30.2
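The split between array callers (which keep using the macro) and pointer callers (which pass an explicit size) can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: ctl_table_sketch, register_sz_sketch and register_sketch are hypothetical stand-ins for ctl_table, register_sysctl_sz and register_sysctl, and ARRAY_SIZE is redefined locally without the kernel's array type check.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in; the kernel macro additionally rejects pointers at build time. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical, trimmed-down stand-in for struct ctl_table. */
struct ctl_table_sketch {
	const char *procname;
};

/* Stand-in for register_sysctl_sz(): simply reports the size it was handed. */
static size_t register_sz_sketch(const char *path,
				 struct ctl_table_sketch *table,
				 size_t table_size)
{
	assert(path != NULL && table != NULL);
	return table_size;
}

/* Same shape as the new macro: the size is computed at the call site. */
#define register_sketch(path, table) \
	register_sz_sketch(path, table, ARRAY_SIZE(table))

/* A real array with the sentinel still present, so ARRAY_SIZE() counts it. */
static struct ctl_table_sketch demo_table[] = {
	{ .procname = "first" },
	{ .procname = "second" },
	{ .procname = NULL },	/* sentinel */
};

/* Array caller: source unchanged, goes through the macro. */
static size_t register_array_caller(void)
{
	return register_sketch("demo", demo_table);
}

/* Pointer caller: ARRAY_SIZE() is not usable, so the size is explicit. */
static size_t register_pointer_caller(struct ctl_table_sketch *one_entry)
{
	return register_sz_sketch("demo", one_entry, 1);
}
```

In this sketch the array caller forwards a size that includes the sentinel, while the pointer caller states how many entries it wants registered. As long as sentinels remain, the registration code described in the cover letter still stops at the first NULL procname.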
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 07/14] sysctl: Add size arg to __register_sysctl_init
This commit adds table_size to __register_sysctl_init in preparation for the removal of the sentinel elements in the ctl_table arrays (last empty markers). And though we do *not* remove any sentinels in this commit, we set things up by calculating the ctl_table array size with ARRAY_SIZE.

We add a table_size argument to __register_sysctl_init and modify the register_sysctl_init macro to calculate the array size with ARRAY_SIZE. The original callers do not need to be updated as they will go through the new macro.

Signed-off-by: Joel Granados <j.granados at samsung.com>
Suggested-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
---
 fs/proc/proc_sysctl.c  | 12 +++---------
 include/linux/sysctl.h |  5 +++--
 2 files changed, 6 insertions(+), 11 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 80d3e2f61947..817bc51c58d8 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -1433,6 +1433,7 @@ EXPORT_SYMBOL(register_sysctl_sz);
  * 	 lifetime use of the sysctl.
  * @table_name: The name of sysctl table, only used for log printing when
  * 	registration fails
+ * @table_size: The number of elements in table
  *
  * The sysctl interface is used by userspace to query or modify at runtime
  * a predefined value set on a variable. These variables however have default
@@ -1445,16 +1446,9 @@ EXPORT_SYMBOL(register_sysctl_sz);
  * Context: if your base directory does not exist it will be created for you.
  */
 void __init __register_sysctl_init(const char *path, struct ctl_table *table,
-				 const char *table_name)
+				 const char *table_name, size_t table_size)
 {
-	int count = 0;
-	struct ctl_table *entry;
-	struct ctl_table_header t_hdr, *hdr;
-
-	t_hdr.ctl_table = table;
-	list_for_each_table_entry(entry, (&t_hdr))
-		count++;
-	hdr = register_sysctl_sz(path, table, count);
+	struct ctl_table_header *hdr = register_sysctl_sz(path, table, table_size);
 	if (unlikely(!hdr)) {
 		pr_err("failed when register_sysctl_sz %s to %s\n", table_name, path);
diff --git a/include/linux/sysctl.h b/include/linux/sysctl.h
index b1168ae281c9..09d7429d67c0 100644
--- a/include/linux/sysctl.h
+++ b/include/linux/sysctl.h
@@ -236,8 +236,9 @@ void unregister_sysctl_table(struct ctl_table_header * table);
 extern int sysctl_init_bases(void);
 extern void __register_sysctl_init(const char *path, struct ctl_table *table,
-				 const char *table_name);
-#define register_sysctl_init(path, table) __register_sysctl_init(path, table, #table)
+				 const char *table_name, size_t table_size);
+#define register_sysctl_init(path, table)	\
+	__register_sysctl_init(path, table, #table, ARRAY_SIZE(table))
 extern struct ctl_table_header *register_sysctl_mount_point(const char *path);
 void do_sysctl_args(void);
--
2.30.2
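What the reworked register_sysctl_init macro hands to its helper can be checked with a small userspace sketch. The names register_init_impl_sketch, register_init_sketch, init_call_sketch and demo_init_table are hypothetical and only imitate the shape of the kernel macro. The point is that `#table` yields the array's name as a string and ARRAY_SIZE(table) yields its element count, both at the call site.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical, trimmed-down stand-in for struct ctl_table. */
struct ctl_table_sketch {
	const char *procname;
};

/* Records what the last call received so the macro expansion can be inspected. */
struct init_call_sketch {
	const char *path;
	const char *table_name;
	size_t table_size;
};

static struct init_call_sketch last_call;

/* Stand-in for __register_sysctl_init(): stores its arguments. */
static void register_init_impl_sketch(const char *path,
				      struct ctl_table_sketch *table,
				      const char *table_name,
				      size_t table_size)
{
	assert(table != NULL);
	last_call.path = path;
	last_call.table_name = table_name;
	last_call.table_size = table_size;
}

/* Same shape as the patched macro: name via #table, size via ARRAY_SIZE(). */
#define register_init_sketch(path, table) \
	register_init_impl_sketch(path, table, #table, ARRAY_SIZE(table))

static struct ctl_table_sketch demo_init_table[] = {
	{ .procname = "knob" },
	{ .procname = NULL },	/* sentinel, still counted by ARRAY_SIZE() */
};

/* An existing caller: its source line does not change with the patch. */
static void run_init_demo(void)
{
	register_init_sketch("kernel", demo_init_table);
}
```

Because both extra arguments are derived inside the macro, callers that already pass a real array need no edits, which matches the commit message's claim.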
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 08/14] sysctl: Add size to register_net_sysctl function
This commit adds size to the register_net_sysctl indirection function to facilitate the removal of the sentinel elements (last empty markers) from the ctl_table arrays. Though we don't actually remove any sentinels in this commit, register_net_sysctl* now has the capability of forwarding table_size for when that happens.

We create a new function register_net_sysctl_sz with an extra size argument. A macro replaces the existing register_net_sysctl. The size in the macro is SIZE_MAX instead of ARRAY_SIZE to avoid compilation errors while we systematically migrate to register_net_sysctl_sz. This will change to ARRAY_SIZE in subsequent commits.

Care is taken to add table_size to the stopping criteria in such a way that when we remove the empty sentinel element, it will continue stopping in the last element of the ctl_table array.

Signed-off-by: Joel Granados <j.granados at samsung.com>
Suggested-by: Greg Kroah-Hartman <gregkh at linuxfoundation.org>
---
 include/net/net_namespace.h | 10 ++++++----
 net/sysctl_net.c            | 22 +++++++++++++---------
 2 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 78beaa765c73..e4e5fe75a281 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -469,15 +469,17 @@ void unregister_pernet_device(struct pernet_operations *);
 struct ctl_table;
+#define register_net_sysctl(net, path, table)	\
+	register_net_sysctl_sz(net, path, table, SIZE_MAX)
 #ifdef CONFIG_SYSCTL
 int net_sysctl_init(void);
-struct ctl_table_header *register_net_sysctl(struct net *net, const char *path,
-					     struct ctl_table *table);
+struct ctl_table_header *register_net_sysctl_sz(struct net *net, const char *path,
+					     struct ctl_table *table, size_t table_size);
 void unregister_net_sysctl_table(struct ctl_table_header *header);
 #else
 static inline int net_sysctl_init(void) { return 0; }
-static inline struct ctl_table_header *register_net_sysctl(struct net *net,
-	const char *path, struct ctl_table *table)
+static inline struct ctl_table_header *register_net_sysctl_sz(struct net *net,
+	const char *path, struct ctl_table *table, size_t table_size)
 {
 	return NULL;
 }
diff --git a/net/sysctl_net.c b/net/sysctl_net.c
index d9cbbb51b143..051ed5f6fc93 100644
--- a/net/sysctl_net.c
+++ b/net/sysctl_net.c
@@ -122,12 +122,13 @@ __init int net_sysctl_init(void)
  * allocated.
  */
 static void ensure_safe_net_sysctl(struct net *net, const char *path,
-				   struct ctl_table *table)
+				   struct ctl_table *table, size_t table_size)
 {
 	struct ctl_table *ent;
 	pr_debug("Registering net sysctl (net %p): %s\n", net, path);
-	for (ent = table; ent->procname; ent++) {
+	ent = table;
+	for (size_t i = 0; i < table_size && ent->procname; ent++, i++) {
 		unsigned long addr;
 		const char *where;
@@ -160,21 +161,24 @@ static void ensure_safe_net_sysctl(struct net *net, const char *path,
 	}
 }
-struct ctl_table_header *register_net_sysctl(struct net *net,
-	const char *path, struct ctl_table *table)
+struct ctl_table_header *register_net_sysctl_sz(struct net *net,
+						const char *path,
+						struct ctl_table *table,
+						size_t table_size)
 {
-	int count = 0;
+	int count;
 	struct ctl_table *entry;
 	if (!net_eq(net, &init_net))
-		ensure_safe_net_sysctl(net, path, table);
+		ensure_safe_net_sysctl(net, path, table, table_size);
-	for (entry = table; entry->procname; entry++)
-		count++;
+	entry = table;
+	for (count = 0 ; count < table_size && entry->procname; entry++, count++)
+		;
 	return __register_sysctl_table(&net->sysctls, path, table, count);
 }
-EXPORT_SYMBOL_GPL(register_net_sysctl);
+EXPORT_SYMBOL_GPL(register_net_sysctl_sz);
 void unregister_net_sysctl_table(struct ctl_table_header *header)
 {
--
2.30.2
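The combined stopping criterion (stop at table_size or at the first NULL procname, whichever comes first) can be exercised outside the kernel. The following is a minimal sketch with hypothetical names (ctl_table_sketch, count_entries_sketch); it mirrors the loop shape in register_net_sysctl_sz but is not the kernel function itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct ctl_table. */
struct ctl_table_sketch {
	const char *procname;
};

/* Count entries until either the size bound or a NULL procname is reached. */
static size_t count_entries_sketch(const struct ctl_table_sketch *table,
				   size_t table_size)
{
	const struct ctl_table_sketch *ent = table;
	size_t count;

	assert(table != NULL);
	for (count = 0; count < table_size && ent->procname; ent++, count++)
		;
	return count;
}

/* Today's layout: two entries followed by the empty sentinel. */
static const struct ctl_table_sketch with_sentinel[] = {
	{ .procname = "a" },
	{ .procname = "b" },
	{ .procname = NULL },
};

/* Future layout: the sentinel has been removed. */
static const struct ctl_table_sketch without_sentinel[] = {
	{ .procname = "a" },
	{ .procname = "b" },
};
```

With the sentinel present, a bound of SIZE_MAX (what the interim macro passes) and a bound of 3 both yield a count of 2, because the NULL procname ends the walk. Without the sentinel, only the exact bound of 2 is safe: passing SIZE_MAX to a sentinel-free table would read past the array, which is one reason the SIZE_MAX placeholder has to go before any sentinel is removed.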
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 09/14] ax.25: Update to register_net_sysctl_sz
Move from register_net_sysctl to register_net_sysctl_sz and pass the ARRAY_SIZE of the ctl_table array that was used to create the table variable. We need to move to the new function in preparation for when we change SIZE_MAX to ARRAY_SIZE() in the register_net_sysctl macro. Failing to do so would erroneously allow ARRAY_SIZE() to be called on a pointer. We hold off the SIZE_MAX to ARRAY_SIZE change until we have migrated all the relevant net sysctl registering functions to register_net_sysctl_sz in subsequent commits.

Signed-off-by: Joel Granados <j.granados at samsung.com>
---
 net/ax25/sysctl_net_ax25.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ax25/sysctl_net_ax25.c b/net/ax25/sysctl_net_ax25.c
index 2154d004d3dc..db66e11e7fe8 100644
--- a/net/ax25/sysctl_net_ax25.c
+++ b/net/ax25/sysctl_net_ax25.c
@@ -159,7 +159,8 @@ int ax25_register_dev_sysctl(ax25_dev *ax25_dev)
 		table[k].data = &ax25_dev->values[k];
 	snprintf(path, sizeof(path), "net/ax25/%s", ax25_dev->dev->name);
-	ax25_dev->sysheader = register_net_sysctl(&init_net, path, table);
+	ax25_dev->sysheader = register_net_sysctl_sz(&init_net, path, table,
+						     ARRAY_SIZE(ax25_param_table));
 	if (!ax25_dev->sysheader) {
 		kfree(table);
 		return -ENOMEM;
--
2.30.2
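The reason the size has to come from the template array rather than from the `table` variable can be sketched in userspace C. Here malloc plus memcpy stands in for the kernel's table duplication, and all names (ctl_table_sketch, param_template, registration_sketch, register_copy_sketch) are hypothetical. Once the table has been duplicated, the caller only holds a pointer, so the element count is taken from the array the copy was made from.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical, trimmed-down stand-in for struct ctl_table. */
struct ctl_table_sketch {
	const char *procname;
	int *data;
};

/* Template array: its size is known to the compiler. */
static const struct ctl_table_sketch param_template[] = {
	{ .procname = "param_a" },
	{ .procname = "param_b" },
	{ .procname = NULL },	/* sentinel */
};

/* What a registration call would receive in this sketch. */
struct registration_sketch {
	struct ctl_table_sketch *table;
	size_t table_size;
};

/* Duplicate the template per device and record the size alongside the copy. */
static int register_copy_sketch(struct registration_sketch *reg, int *values)
{
	struct ctl_table_sketch *table;
	size_t k;

	assert(reg != NULL && values != NULL);
	table = malloc(sizeof(param_template));
	if (!table)
		return -1;
	memcpy(table, param_template, sizeof(param_template));

	/* Point every real entry (not the sentinel) at per-device storage. */
	for (k = 0; k < ARRAY_SIZE(param_template) - 1; k++)
		table[k].data = &values[k];

	/* `table` is a pointer: the count must come from the template array. */
	reg->table = table;
	reg->table_size = ARRAY_SIZE(param_template);
	return 0;
}
```

The caller owns the copy and frees it when the registration goes away. In the kernel, ARRAY_SIZE() additionally refuses to compile when given a pointer, which is why the macro cannot switch from SIZE_MAX to ARRAY_SIZE until every pointer-passing caller has been converted.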
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 10/14] netfilter: Update to register_net_sysctl_sz
Move from register_net_sysctl to register_net_sysctl_sz for all the netfilter related files. Do this while making sure to mirror the NULL assignments with a table_size of zero for the unprivileged users.

We need to move to the new function in preparation for when we change SIZE_MAX to ARRAY_SIZE() in the register_net_sysctl macro. Failing to do so would erroneously allow ARRAY_SIZE() to be called on a pointer. We hold off the SIZE_MAX to ARRAY_SIZE change until we have migrated all the relevant net sysctl registering functions to register_net_sysctl_sz in subsequent commits.

Signed-off-by: Joel Granados <j.granados at samsung.com>
---
 net/bridge/br_netfilter_hooks.c         |  3 ++-
 net/ipv6/netfilter/nf_conntrack_reasm.c |  3 ++-
 net/netfilter/ipvs/ip_vs_ctl.c          |  8 ++++++--
 net/netfilter/ipvs/ip_vs_lblc.c         | 10 +++++++---
 net/netfilter/ipvs/ip_vs_lblcr.c        | 10 +++++++---
 net/netfilter/nf_conntrack_standalone.c |  4 +++-
 net/netfilter/nf_log.c                  |  7 ++++---
 7 files changed, 31 insertions(+), 14 deletions(-)

diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 1a801fab9543..15186247b59a 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -1135,7 +1135,8 @@ static int br_netfilter_sysctl_init_net(struct net *net)
 	br_netfilter_sysctl_default(brnet);
-	brnet->ctl_hdr = register_net_sysctl(net, "net/bridge", table);
+	brnet->ctl_hdr = register_net_sysctl_sz(net, "net/bridge", table,
+						ARRAY_SIZE(brnf_table));
 	if (!brnet->ctl_hdr) {
 		if (!net_eq(net, &init_net))
 			kfree(table);
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index d13240f13607..b2dd48911c8d 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -87,7 +87,8 @@ static int nf_ct_frag6_sysctl_register(struct net *net)
 	table[2].data = &nf_frag->fqdir->high_thresh;
 	table[2].extra1 = &nf_frag->fqdir->low_thresh;
-	hdr = register_net_sysctl(net, "net/netfilter", table);
+	hdr = register_net_sysctl_sz(net, "net/netfilter", table,
+				     ARRAY_SIZE(nf_ct_frag6_sysctl_table));
 	if (hdr == NULL)
 		goto err_reg;
diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
index 62606fb44d02..8d69e4c2d822 100644
--- a/net/netfilter/ipvs/ip_vs_ctl.c
+++ b/net/netfilter/ipvs/ip_vs_ctl.c
@@ -4266,6 +4266,7 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
 	struct net *net = ipvs->net;
 	struct ctl_table *tbl;
 	int idx, ret;
+	size_t ctl_table_size = ARRAY_SIZE(vs_vars);
 	atomic_set(&ipvs->dropentry, 0);
 	spin_lock_init(&ipvs->dropentry_lock);
@@ -4282,8 +4283,10 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
 			return -ENOMEM;
 		/* Don't export sysctls to unprivileged users */
-		if (net->user_ns != &init_user_ns)
+		if (net->user_ns != &init_user_ns) {
 			tbl[0].procname = NULL;
+			ctl_table_size = 0;
+		}
 	} else
 		tbl = vs_vars;
 	/* Initialize sysctl defaults */
@@ -4353,7 +4356,8 @@ static int __net_init ip_vs_control_net_init_sysctl(struct netns_ipvs *ipvs)
 #endif
 	ret = -ENOMEM;
-	ipvs->sysctl_hdr = register_net_sysctl(net, "net/ipv4/vs", tbl);
+	ipvs->sysctl_hdr = register_net_sysctl_sz(net, "net/ipv4/vs", tbl,
+						  ctl_table_size);
 	if (!ipvs->sysctl_hdr)
 		goto err;
 	ipvs->sysctl_tbl = tbl;
diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
index 1b87214d385e..cf78ba4ce5ff 100644
--- a/net/netfilter/ipvs/ip_vs_lblc.c
+++ b/net/netfilter/ipvs/ip_vs_lblc.c
@@ -550,6 +550,7 @@ static struct ip_vs_scheduler ip_vs_lblc_scheduler = {
 static int __net_init __ip_vs_lblc_init(struct net *net)
 {
 	struct netns_ipvs *ipvs = net_ipvs(net);
+	size_t vars_table_size = ARRAY_SIZE(vs_vars_table);
 	if (!ipvs)
 		return -ENOENT;
@@ -562,16 +563,19 @@ static int __net_init __ip_vs_lblc_init(struct net *net)
 			return -ENOMEM;
 		/* Don't export sysctls to unprivileged users */
-		if (net->user_ns != &init_user_ns)
+		if (net->user_ns != &init_user_ns) {
 			ipvs->lblc_ctl_table[0].procname = NULL;
+			vars_table_size = 0;
+		}
 	} else
 		ipvs->lblc_ctl_table = vs_vars_table;
 	ipvs->sysctl_lblc_expiration = DEFAULT_EXPIRATION;
 	ipvs->lblc_ctl_table[0].data = &ipvs->sysctl_lblc_expiration;
-	ipvs->lblc_ctl_header =
-		register_net_sysctl(net, "net/ipv4/vs", ipvs->lblc_ctl_table);
+	ipvs->lblc_ctl_header = register_net_sysctl_sz(net, "net/ipv4/vs",
+						       ipvs->lblc_ctl_table,
+						       vars_table_size);
 	if (!ipvs->lblc_ctl_header) {
 		if (!net_eq(net, &init_net))
 			kfree(ipvs->lblc_ctl_table);
diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
index ad8f5fea6d3a..9eddf118b40e 100644
--- a/net/netfilter/ipvs/ip_vs_lblcr.c
+++ b/net/netfilter/ipvs/ip_vs_lblcr.c
@@ -736,6 +736,7 @@ static struct ip_vs_scheduler ip_vs_lblcr_scheduler =
 static int __net_init __ip_vs_lblcr_init(struct net *net)
 {
 	struct netns_ipvs *ipvs = net_ipvs(net);
+	size_t vars_table_size = ARRAY_SIZE(vs_vars_table);
 	if (!ipvs)
 		return -ENOENT;
@@ -748,15 +749,18 @@ static int __net_init __ip_vs_lblcr_init(struct net *net)
 			return -ENOMEM;
 		/* Don't export sysctls to unprivileged users */
-		if (net->user_ns != &init_user_ns)
+		if (net->user_ns != &init_user_ns) {
 			ipvs->lblcr_ctl_table[0].procname = NULL;
+			vars_table_size = 0;
+		}
 	} else
 		ipvs->lblcr_ctl_table = vs_vars_table;
 	ipvs->sysctl_lblcr_expiration = DEFAULT_EXPIRATION;
 	ipvs->lblcr_ctl_table[0].data = &ipvs->sysctl_lblcr_expiration;
-	ipvs->lblcr_ctl_header =
-		register_net_sysctl(net, "net/ipv4/vs", ipvs->lblcr_ctl_table);
+	ipvs->lblcr_ctl_header = register_net_sysctl_sz(net, "net/ipv4/vs",
+							ipvs->lblcr_ctl_table,
+							vars_table_size);
 	if (!ipvs->lblcr_ctl_header) {
 		if (!net_eq(net, &init_net))
 			kfree(ipvs->lblcr_ctl_table);
diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
index 169e16fc2bce..0ee98ce5b816 100644
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -1106,7 +1106,9 @@ static int nf_conntrack_standalone_init_sysctl(struct net *net)
 		table[NF_SYSCTL_CT_BUCKETS].mode = 0444;
 	}
-	cnet->sysctl_header = register_net_sysctl(net, "net/netfilter", table);
+	cnet->sysctl_header = register_net_sysctl_sz(net, "net/netfilter",
+						     table,
+						     ARRAY_SIZE(nf_ct_sysctl_table));
 	if (!cnet->sysctl_header)
 		goto out_unregister_netfilter;
diff --git a/net/netfilter/nf_log.c b/net/netfilter/nf_log.c
index 8a29290149bd..8cc52d2bd31b 100644
--- a/net/netfilter/nf_log.c
+++ b/net/netfilter/nf_log.c
@@ -487,9 +487,10 @@ static int netfilter_log_sysctl_init(struct net *net)
 	for (i = NFPROTO_UNSPEC; i < NFPROTO_NUMPROTO; i++)
 		table[i].extra2 = net;
-	net->nf.nf_log_dir_header = register_net_sysctl(net,
-						"net/netfilter/nf_log",
-						table);
+	net->nf.nf_log_dir_header = register_net_sysctl_sz(net,
+						"net/netfilter/nf_log",
+						table,
+						ARRAY_SIZE(nf_log_sysctl_table));
 	if (!net->nf.nf_log_dir_header)
 		goto err_reg;
--
2.30.2
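The "mirror the NULL assignment with a table_size of zero" pattern used for unprivileged user namespaces can be isolated in a short userspace sketch. The helper hide_if_unprivileged_sketch and the boolean `privileged` flag are hypothetical simplifications of the `net->user_ns != &init_user_ns` checks in the diff; the sketch only shows that both conventions are kept in step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct ctl_table. */
struct ctl_table_sketch {
	const char *procname;
};

/*
 * Return the size to register. For unprivileged callers the table is made
 * empty under both conventions: the old one (first procname is NULL) and
 * the new one (size of zero).
 */
static size_t hide_if_unprivileged_sketch(struct ctl_table_sketch *table,
					  size_t full_size, bool privileged)
{
	assert(table != NULL);
	if (!privileged) {
		table[0].procname = NULL;	/* sentinel-based termination */
		return 0;			/* size-based termination */
	}
	return full_size;
}
```

Keeping the NULL assignment means the behaviour is unchanged while sentinels still exist, and returning zero means it stays unchanged once the size becomes the only stopping criterion.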
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 11/14] networking: Update to register_net_sysctl_sz
Move from register_net_sysctl to register_net_sysctl_sz for all the networking related files. Do this while making sure to mirror the NULL assignments with a table_size of zero for the unprivileged users.

We need to move to the new function in preparation for when we change SIZE_MAX to ARRAY_SIZE() in the register_net_sysctl macro. Failing to do so would erroneously allow ARRAY_SIZE() to be called on a pointer. We hold off the SIZE_MAX to ARRAY_SIZE change until we have migrated all the relevant net sysctl registering functions to register_net_sysctl_sz in subsequent commits.

An additional size function was added to the following files in order to calculate the size of an array that is defined in another file:
    include/net/ipv6.h
    net/ipv6/icmp.c
    net/ipv6/route.c
    net/ipv6/sysctl_net_ipv6.c

Signed-off-by: Joel Granados <j.granados at samsung.com>
---
 include/net/ipv6.h                  |  2 ++
 net/core/neighbour.c                |  8 ++++++--
 net/core/sysctl_net_core.c          |  3 ++-
 net/ieee802154/6lowpan/reassembly.c |  8 ++++++--
 net/ipv4/devinet.c                  |  3 ++-
 net/ipv4/ip_fragment.c              |  3 ++-
 net/ipv4/route.c                    |  8 ++++++--
 net/ipv4/sysctl_net_ipv4.c          |  3 ++-
 net/ipv4/xfrm4_policy.c             |  3 ++-
 net/ipv6/addrconf.c                 |  3 ++-
 net/ipv6/icmp.c                     |  5 +++++
 net/ipv6/reassembly.c               |  3 ++-
 net/ipv6/route.c                    | 13 +++++++++----
 net/ipv6/sysctl_net_ipv6.c          | 16 +++++++++++-----
 net/ipv6/xfrm6_policy.c             |  3 ++-
 net/mpls/af_mpls.c                  |  6 ++++--
 net/mptcp/ctrl.c                    |  3 ++-
 net/rds/tcp.c                       |  3 ++-
 net/sctp/sysctl.c                   |  4 +++-
 net/smc/smc_sysctl.c                |  3 ++-
 net/unix/sysctl_net_unix.c          |  3 ++-
 net/xfrm/xfrm_sysctl.c              |  8 ++++++--
 22 files changed, 82 insertions(+), 32 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 7332296eca44..63ba68536a20 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1274,7 +1274,9 @@ static inline int snmp6_unregister_dev(struct inet6_dev *idev) { return 0; }
 #ifdef CONFIG_SYSCTL
 struct ctl_table *ipv6_icmp_sysctl_init(struct net *net);
+size_t ipv6_icmp_sysctl_table_size(void);
 struct ctl_table *ipv6_route_sysctl_init(struct net *net);
+size_t ipv6_route_sysctl_table_size(struct net *net);
 int ipv6_sysctl_register(void);
 void ipv6_sysctl_unregister(void);
 #endif
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index ddd0f32de20e..adc7fc4ff9bf 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -3779,6 +3779,7 @@ int neigh_sysctl_register(struct net_device *dev, struct neigh_parms *p,
 	const char *dev_name_source;
 	char neigh_path[ sizeof("net//neigh/") + IFNAMSIZ + IFNAMSIZ ];
 	char *p_name;
+	size_t neigh_vars_size;
 	t = kmemdup(&neigh_sysctl_template, sizeof(*t), GFP_KERNEL_ACCOUNT);
 	if (!t)
@@ -3790,11 +3791,13 @@ int neigh_sysctl_register(struct net_device *dev, struct neigh_parms *p,
 		t->neigh_vars[i].extra2 = p;
 	}
+	neigh_vars_size = ARRAY_SIZE(t->neigh_vars);
 	if (dev) {
 		dev_name_source = dev->name;
 		/* Terminate the table early */
 		memset(&t->neigh_vars[NEIGH_VAR_GC_INTERVAL], 0,
 		       sizeof(t->neigh_vars[NEIGH_VAR_GC_INTERVAL]));
+		neigh_vars_size = NEIGH_VAR_BASE_REACHABLE_TIME_MS;
 	} else {
 		struct neigh_table *tbl = p->tbl;
 		dev_name_source = "default";
@@ -3841,8 +3844,9 @@ int neigh_sysctl_register(struct net_device *dev, struct neigh_parms *p,
 	snprintf(neigh_path, sizeof(neigh_path), "net/%s/neigh/%s",
 		p_name, dev_name_source);
-	t->sysctl_header =
-		register_net_sysctl(neigh_parms_net(p), neigh_path, t->neigh_vars);
+	t->sysctl_header = register_net_sysctl_sz(neigh_parms_net(p),
+						  neigh_path, t->neigh_vars,
+						  neigh_vars_size);
 	if (!t->sysctl_header)
 		goto free;
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 782273bb93c2..03f1edb948d7 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -712,7 +712,8 @@ static __net_init int sysctl_core_net_init(struct net *net)
 			tmp->data += (char *)net - (char *)&init_net;
 	}
-	net->core.sysctl_hdr = register_net_sysctl(net, "net/core", tbl);
+	net->core.sysctl_hdr = register_net_sysctl_sz(net, "net/core", tbl,
+						      ARRAY_SIZE(netns_core_table));
 	if (net->core.sysctl_hdr == NULL)
 		goto err_reg;
diff --git a/net/ieee802154/6lowpan/reassembly.c b/net/ieee802154/6lowpan/reassembly.c
index a91283d1e5bf..6dd960ec558c 100644
--- a/net/ieee802154/6lowpan/reassembly.c
+++ b/net/ieee802154/6lowpan/reassembly.c
@@ -360,6 +360,7 @@ static int __net_init lowpan_frags_ns_sysctl_register(struct net *net)
 	struct ctl_table_header *hdr;
 	struct netns_ieee802154_lowpan *ieee802154_lowpan =
 		net_ieee802154_lowpan(net);
+	size_t table_size = ARRAY_SIZE(lowpan_frags_ns_ctl_table);
 	table = lowpan_frags_ns_ctl_table;
 	if (!net_eq(net, &init_net)) {
@@ -369,8 +370,10 @@ static int __net_init lowpan_frags_ns_sysctl_register(struct net *net)
 			goto err_alloc;
 		/* Don't export sysctls to unprivileged users */
-		if (net->user_ns != &init_user_ns)
+		if (net->user_ns != &init_user_ns) {
 			table[0].procname = NULL;
+			table_size = 0;
+		}
 	}
 	table[0].data = &ieee802154_lowpan->fqdir->high_thresh;
@@ -379,7 +382,8 @@ static int __net_init lowpan_frags_ns_sysctl_register(struct net *net)
 	table[1].extra2 = &ieee802154_lowpan->fqdir->high_thresh;
 	table[2].data = &ieee802154_lowpan->fqdir->timeout;
-	hdr = register_net_sysctl(net, "net/ieee802154/6lowpan", table);
+	hdr = register_net_sysctl_sz(net, "net/ieee802154/6lowpan", table,
+				     table_size);
 	if (hdr == NULL)
 		goto err_reg;
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index 5deac0517ef7..89087844ea6e 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -2720,7 +2720,8 @@ static __net_init int devinet_init_net(struct net *net)
 		goto err_reg_dflt;
 	err = -ENOMEM;
-	forw_hdr = register_net_sysctl(net, "net/ipv4", tbl);
+	forw_hdr = register_net_sysctl_sz(net, "net/ipv4", tbl,
+					  ARRAY_SIZE(ctl_forward_entry));
 	if (!forw_hdr)
 		goto err_reg_ctl;
 	net->ipv4.forw_hdr = forw_hdr;
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 69c00ffdcf3e..a4941f53b523 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -615,7 +615,8 @@ static int __net_init ip4_frags_ns_ctl_register(struct net *net)
 	table[2].data = &net->ipv4.fqdir->timeout;
 	table[3].data = &net->ipv4.fqdir->max_dist;
-	hdr = register_net_sysctl(net, "net/ipv4", table);
+	hdr = register_net_sysctl_sz(net, "net/ipv4", table,
+				     ARRAY_SIZE(ip4_frags_ns_ctl_table));
 	if (!hdr)
 		goto err_reg;
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 98d7e6ba7493..e7e9fba0357a 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -3592,6 +3592,7 @@ static struct ctl_table ipv4_route_netns_table[] = {
 static __net_init int sysctl_route_net_init(struct net *net)
 {
 	struct ctl_table *tbl;
+	size_t table_size = ARRAY_SIZE(ipv4_route_netns_table);
 	tbl = ipv4_route_netns_table;
 	if (!net_eq(net, &init_net)) {
@@ -3603,8 +3604,10 @@ static __net_init int sysctl_route_net_init(struct net *net)
 		/* Don't export non-whitelisted sysctls to unprivileged users */
 		if (net->user_ns != &init_user_ns) {
-			if (tbl[0].procname != ipv4_route_flush_procname)
+			if (tbl[0].procname != ipv4_route_flush_procname) {
 				tbl[0].procname = NULL;
+				table_size = 0;
+			}
 		}
 		/* Update the variables to point into the current struct net
@@ -3615,7 +3618,8 @@ static __net_init int sysctl_route_net_init(struct net *net)
 	}
 	tbl[0].extra1 = net;
-	net->ipv4.route_hdr = register_net_sysctl(net, "net/ipv4/route", tbl);
+	net->ipv4.route_hdr = register_net_sysctl_sz(net, "net/ipv4/route",
+						     tbl, table_size);
 	if (!net->ipv4.route_hdr)
 		goto err_reg;
 	return 0;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 2afb0870648b..6ac890b4073f 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -1519,7 +1519,8 @@ static __net_init int ipv4_sysctl_init_net(struct net *net)
 		}
 	}
-	net->ipv4.ipv4_hdr = register_net_sysctl(net, "net/ipv4", table);
+	net->ipv4.ipv4_hdr = register_net_sysctl_sz(net, "net/ipv4", table,
+						    ARRAY_SIZE(ipv4_net_table));
 	if (!net->ipv4.ipv4_hdr)
 		goto err_reg;
diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
index 9403bbaf1b61..57ea394ffa8c 100644
--- a/net/ipv4/xfrm4_policy.c
+++ b/net/ipv4/xfrm4_policy.c
@@ -178,7 +178,8 @@ static __net_init int xfrm4_net_sysctl_init(struct net *net)
 		table[0].data = &net->xfrm.xfrm4_dst_ops.gc_thresh;
 	}
-	hdr = register_net_sysctl(net, "net/ipv4", table);
+	hdr = register_net_sysctl_sz(net, "net/ipv4", table,
+				     ARRAY_SIZE(xfrm4_policy_table));
 	if (!hdr)
 		goto err_reg;
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index e5213e598a04..d615a84965c2 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -7085,7 +7085,8 @@ static int __addrconf_sysctl_register(struct net *net, char *dev_name,
 	snprintf(path, sizeof(path), "net/ipv6/conf/%s", dev_name);
-	p->sysctl_header = register_net_sysctl(net, path, table);
+	p->sysctl_header = register_net_sysctl_sz(net, path, table,
+						  ARRAY_SIZE(addrconf_sysctl));
 	if (!p->sysctl_header)
 		goto free;
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 65fa5014bc85..a76b01b41b57 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -1229,4 +1229,9 @@ struct ctl_table * __net_init ipv6_icmp_sysctl_init(struct net *net)
 	}
 	return table;
 }
+
+size_t ipv6_icmp_sysctl_table_size(void)
+{
+	return ARRAY_SIZE(ipv6_icmp_table_template);
+}
 #endif
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index 5bc8a28e67f9..5ebc47da1000 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -470,7 +470,8 @@ static int __net_init ip6_frags_ns_sysctl_register(struct net *net)
 	table[1].extra2 = &net->ipv6.fqdir->high_thresh;
 	table[2].data = &net->ipv6.fqdir->timeout;
-	hdr = register_net_sysctl(net, "net/ipv6", table);
+	hdr = register_net_sysctl_sz(net, "net/ipv6", table,
+				     ARRAY_SIZE(ip6_frags_ns_ctl_table));
 	if (!hdr)
 		goto err_reg;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 64e873f5895f..51c6cdae8723 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -6447,14 +6447,19 @@ struct ctl_table * __net_init ipv6_route_sysctl_init(struct net *net)
 		table[8].data = &net->ipv6.sysctl.ip6_rt_min_advmss;
 		table[9].data = &net->ipv6.sysctl.ip6_rt_gc_min_interval;
 		table[10].data = &net->ipv6.sysctl.skip_notify_on_dev_down;
-
-		/* Don't export sysctls to unprivileged users */
-		if (net->user_ns != &init_user_ns)
-			table[1].procname = NULL;
 	}
 	return table;
 }
+
+size_t ipv6_route_sysctl_table_size(struct net *net)
+{
+	/* Don't export sysctls to unprivileged users */
+	if (net->user_ns != &init_user_ns)
+		return 0;
+
+	return ARRAY_SIZE(ipv6_route_table_template);
+}
 #endif
 static int __net_init ip6_route_net_init(struct net *net)
diff --git a/net/ipv6/sysctl_net_ipv6.c b/net/ipv6/sysctl_net_ipv6.c
index 94a0a294c6a1..888676163e90 100644
--- a/net/ipv6/sysctl_net_ipv6.c
+++ b/net/ipv6/sysctl_net_ipv6.c
@@ -275,17 +275,23 @@ static int __net_init ipv6_sysctl_net_init(struct net *net)
 	if (!ipv6_icmp_table)
 		goto out_ipv6_route_table;
-	net->ipv6.sysctl.hdr = register_net_sysctl(net, "net/ipv6", ipv6_table);
+	net->ipv6.sysctl.hdr = register_net_sysctl_sz(net, "net/ipv6",
+						      ipv6_table,
+						      ARRAY_SIZE(ipv6_table_template));
 	if (!net->ipv6.sysctl.hdr)
 		goto out_ipv6_icmp_table;
-	net->ipv6.sysctl.route_hdr =
-		register_net_sysctl(net, "net/ipv6/route", ipv6_route_table);
+	net->ipv6.sysctl.route_hdr = register_net_sysctl_sz(net,
+							    "net/ipv6/route",
+							    ipv6_route_table,
+							    ipv6_route_sysctl_table_size(net));
 	if (!net->ipv6.sysctl.route_hdr)
 		goto out_unregister_ipv6_table;
-	net->ipv6.sysctl.icmp_hdr =
-		register_net_sysctl(net, "net/ipv6/icmp", ipv6_icmp_table);
+	net->ipv6.sysctl.icmp_hdr = register_net_sysctl_sz(net,
+							   "net/ipv6/icmp",
+							   ipv6_icmp_table,
+							   ipv6_icmp_sysctl_table_size());
 	if (!net->ipv6.sysctl.icmp_hdr)
 		goto out_unregister_route_table;
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index eecc5e59da17..8f931e46b460 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -205,7 +205,8 @@ static int __net_init xfrm6_net_sysctl_init(struct net *net)
 		table[0].data = &net->xfrm.xfrm6_dst_ops.gc_thresh;
 	}
-	hdr = register_net_sysctl(net, "net/ipv6", table);
+	hdr = register_net_sysctl_sz(net, "net/ipv6", table,
+				     ARRAY_SIZE(xfrm6_policy_table));
 	if (!hdr)
 		goto err_reg;
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index bf6e81d56263..1af29af65388 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -1419,7 +1419,8 @@ static int mpls_dev_sysctl_register(struct net_device *dev,
 	snprintf(path, sizeof(path), "net/mpls/conf/%s", dev->name);
-	mdev->sysctl = register_net_sysctl(net, path, table);
+	mdev->sysctl = register_net_sysctl_sz(net, path, table,
+					      ARRAY_SIZE(mpls_dev_table));
 	if (!mdev->sysctl)
 		goto free;
@@ -2689,7 +2690,8 @@ static int mpls_net_init(struct net *net)
 	for (i = 0; i < ARRAY_SIZE(mpls_table) - 1; i++)
 		table[i].data = (char *)net + (uintptr_t)table[i].data;
-	net->mpls.ctl = register_net_sysctl(net, "net/mpls", table);
+	net->mpls.ctl = register_net_sysctl_sz(net, "net/mpls", table,
+					       ARRAY_SIZE(mpls_table));
 	if (net->mpls.ctl == NULL) {
 		kfree(table);
 		return -ENOMEM;
diff --git a/net/mptcp/ctrl.c b/net/mptcp/ctrl.c
index ae20b7d92e28..43e540328a52 100644
--- a/net/mptcp/ctrl.c
+++ b/net/mptcp/ctrl.c
@@ -150,7 +150,8 @@ static int mptcp_pernet_new_table(struct net *net, struct mptcp_pernet *pernet)
 	table[4].data = &pernet->stale_loss_cnt;
 	table[5].data = &pernet->pm_type;
-	hdr = register_net_sysctl(net, MPTCP_SYSCTL_PATH, table);
+	hdr = register_net_sysctl_sz(net, MPTCP_SYSCTL_PATH, table,
+				     ARRAY_SIZE(mptcp_sysctl_table));
 	if (!hdr)
 		goto err_reg;
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index c5b86066ff66..2dba7505b414 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -565,7 +565,8 @@ static __net_init int rds_tcp_init_net(struct net *net)
 	}
 	tbl[RDS_TCP_SNDBUF].data = &rtn->sndbuf_size;
 	tbl[RDS_TCP_RCVBUF].data = &rtn->rcvbuf_size;
-	rtn->rds_tcp_sysctl = register_net_sysctl(net, "net/rds/tcp", tbl);
+	rtn->rds_tcp_sysctl = register_net_sysctl_sz(net, "net/rds/tcp", tbl,
+						     ARRAY_SIZE(rds_tcp_sysctl_table));
 	if (!rtn->rds_tcp_sysctl) {
 		pr_warn("could not register sysctl\n");
 		err = -ENOMEM;
diff --git a/net/sctp/sysctl.c b/net/sctp/sysctl.c
index a7a9136198fd..f65d6f92afcb 100644
--- a/net/sctp/sysctl.c
+++ b/net/sctp/sysctl.c
@@ -612,7 +612,9 @@ int sctp_sysctl_net_register(struct net *net)
 	table[SCTP_PF_RETRANS_IDX].extra2 = &net->sctp.ps_retrans;
 	table[SCTP_PS_RETRANS_IDX].extra1 = &net->sctp.pf_retrans;
-	net->sctp.sysctl_header = register_net_sysctl(net, "net/sctp", table);
+	net->sctp.sysctl_header = register_net_sysctl_sz(net, "net/sctp",
+							 table,
+							 ARRAY_SIZE(sctp_net_table));
 	if (net->sctp.sysctl_header == NULL) {
 		kfree(table);
 		return -ENOMEM;
diff --git a/net/smc/smc_sysctl.c b/net/smc/smc_sysctl.c
index b6f79fabb9d3..3ab2d8eefc55 100644
--- a/net/smc/smc_sysctl.c
+++ b/net/smc/smc_sysctl.c
@@ -81,7 +81,8 @@ int __net_init smc_sysctl_net_init(struct net *net)
 			table[i].data += (void *)net - (void *)&init_net;
 	}
-	net->smc.smc_hdr = register_net_sysctl(net, "net/smc", table);
+	net->smc.smc_hdr = register_net_sysctl_sz(net, "net/smc", table,
+						  ARRAY_SIZE(smc_table));
 	if (!net->smc.smc_hdr)
 		goto err_reg;
diff --git a/net/unix/sysctl_net_unix.c b/net/unix/sysctl_net_unix.c
index 500129aa710c..3e84b31c355a 100644
--- a/net/unix/sysctl_net_unix.c
+++ b/net/unix/sysctl_net_unix.c
@@ -36,7 +36,8 @@ int __net_init unix_sysctl_register(struct net *net)
 		table[0].data = &net->unx.sysctl_max_dgram_qlen;
 	}
-	net->unx.ctl = register_net_sysctl(net, "net/unix", table);
+	net->unx.ctl = register_net_sysctl_sz(net, "net/unix", table,
+					      ARRAY_SIZE(unix_table));
 	if (net->unx.ctl == NULL)
 		goto err_reg;
diff --git a/net/xfrm/xfrm_sysctl.c b/net/xfrm/xfrm_sysctl.c
index 0c6c5ef65f9d..7fdeafc838a7 100644
--- a/net/xfrm/xfrm_sysctl.c
+++ b/net/xfrm/xfrm_sysctl.c
@@ -44,6 +44,7 @@ static struct ctl_table xfrm_table[] = {
 int __net_init xfrm_sysctl_init(struct net *net)
 {
 	struct ctl_table *table;
+	size_t table_size = ARRAY_SIZE(xfrm_table);
 	__xfrm_sysctl_init(net);
@@ -56,10 +57,13 @@ int __net_init xfrm_sysctl_init(struct net *net)
 	table[3].data = &net->xfrm.sysctl_acq_expires;
 	/* Don't export sysctls to unprivileged users */
-	if (net->user_ns != &init_user_ns)
+	if (net->user_ns != &init_user_ns) {
 		table[0].procname = NULL;
+		table_size = 0;
+	}
-	net->xfrm.sysctl_hdr = register_net_sysctl(net, "net/core", table);
+	net->xfrm.sysctl_hdr = register_net_sysctl_sz(net, "net/core", table,
+						      table_size);
 	if (!net->xfrm.sysctl_hdr)
 		goto out_register;
 	return 0;
--
2.30.2
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 12/14] vrf: Update to register_net_sysctl_sz
Move from register_net_sysctl to register_net_sysctl_sz and pass the ARRAY_SIZE of the ctl_table array that was used to create the table variable. We need to move to the new function in preparation for when we change SIZE_MAX to ARRAY_SIZE() in the register_net_sysctl macro. Failing to do so would erroneously allow ARRAY_SIZE() to be called on a pointer. The actual change from SIZE_MAX to ARRAY_SIZE will take place in subsequent commits.

Signed-off-by: Joel Granados <j.granados at samsung.com>
---
 drivers/net/vrf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index bdb3a76a352e..f4c3df15a0e5 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -1979,7 +1979,8 @@ static int vrf_netns_init_sysctl(struct net *net, struct netns_vrf *nn_vrf)
 	/* init the extra1 parameter with the reference to current netns */
 	table[0].extra1 = net;

-	nn_vrf->ctl_hdr = register_net_sysctl(net, "net/vrf", table);
+	nn_vrf->ctl_hdr = register_net_sysctl_sz(net, "net/vrf", table,
+						 ARRAY_SIZE(vrf_table));
 	if (!nn_vrf->ctl_hdr) {
 		kfree(table);
 		return -ENOMEM;
-- 
2.30.2
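The pattern this commit relies on can be sketched in userspace C: the per-namespace table is a heap copy reached through a pointer, so the element count has to come from the template array. This is a minimal sketch under stated assumptions, not kernel code. Every `demo_*` and `DEMO_*` name is hypothetical, malloc/memcpy stand in for kmemdup, and the simplified size macro lacks the must-be-array check that the kernel's ARRAY_SIZE() performs at compile time.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct ctl_table; only the name is kept. */
struct demo_entry {
	const char *procname;
};

/* Simplified ARRAY_SIZE; the kernel version also rejects pointers. */
#define DEMO_ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Template array, playing the role of vrf_table in the patch. */
static struct demo_entry demo_template[] = {
	{ "strict_mode" },
	{ NULL },	/* sentinel, still present at this stage */
};

/* Hypothetical registration record: the table pointer plus its size. */
struct demo_header {
	struct demo_entry *table;
	size_t table_size;
};

/* Stand-in for register_net_sysctl_sz(): it only records its arguments. */
static struct demo_header demo_register_sz(struct demo_entry *table,
					   size_t table_size)
{
	struct demo_header hdr = { table, table_size };

	return hdr;
}

/*
 * Per-namespace setup: copy the template, then register the copy.
 * `table` is a pointer, so its element count is taken from the
 * template array, as the patch does with ARRAY_SIZE(vrf_table).
 */
static struct demo_header demo_netns_init(void)
{
	struct demo_header none = { NULL, 0 };
	struct demo_entry *table = malloc(sizeof(demo_template));

	if (!table)
		return none;
	memcpy(table, demo_template, sizeof(demo_template));
	return demo_register_sz(table, DEMO_ARRAY_SIZE(demo_template));
}
```

A caller of demo_netns_init() gets back a header whose table_size is 2 (one real entry plus the sentinel) and whose table points at the heap copy, which the caller is expected to free.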
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 13/14] sysctl: SIZE_MAX->ARRAY_SIZE in register_net_sysctl
Replace SIZE_MAX with ARRAY_SIZE in the register_net_sysctl macro. Now that all the callers of register_net_sysctl pass actual arrays, we can call ARRAY_SIZE() without any compilation warnings. By calculating the actual array size, this commit makes sure that register_net_sysctl and all its callers forward the table_size into the sysctl backend for when the sentinel elements in the ctl_table arrays (last empty markers) are removed. Without it the removal would fail for lack of a stopping criterion when traversing the ctl_table arrays.

The stopping condition continues to be based on both the table size and the procname null test. This is needed in order to allow for the systematic removal of the sentinel element in subsequent commits: before the sentinel is removed, the stopping criterion is the last null element. Once the sentinel is removed, the (correct) size takes over.

Signed-off-by: Joel Granados <j.granados at samsung.com>
Suggested-by: Jani Nikula <jani.nikula at linux.intel.com>
---
 include/net/net_namespace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index e4e5fe75a281..75dba309e043 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -470,7 +470,7 @@ void unregister_pernet_device(struct pernet_operations *);
 struct ctl_table;

 #define register_net_sysctl(net, path, table)	\
-	register_net_sysctl_sz(net, path, table, SIZE_MAX)
+	register_net_sysctl_sz(net, path, table, ARRAY_SIZE(table))
 #ifdef CONFIG_SYSCTL
 int net_sysctl_init(void);
 struct ctl_table_header *register_net_sysctl_sz(struct net *net, const char *path,
-- 
2.30.2
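The shape of the macro change can be sketched in userspace C. Because register_net_sysctl is a macro rather than a function, ARRAY_SIZE(table) expands at the call site, where `table` still has array type. The sketch below is illustrative only: all `demo_*` and `DEMO_*` names are hypothetical, the stand-in backend drops the net and path arguments, and it returns the forwarded size instead of a ctl_table_header pointer.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct ctl_table. */
struct demo_entry {
	const char *procname;
};

/* Simplified ARRAY_SIZE, without the kernel's must-be-array check. */
#define DEMO_ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Stand-in backend: returns the size it was given so callers can inspect it. */
static size_t demo_register_sz(const struct demo_entry *table, size_t table_size)
{
	(void)table;
	return table_size;
}

/* Shape of the patched macro: the size is computed where the array is visible. */
#define demo_register(table) \
	demo_register_sz(table, DEMO_ARRAY_SIZE(table))

/* One table that still carries its sentinel, one that has dropped it. */
static const struct demo_entry demo_with_sentinel[] = {
	{ "first" }, { "second" }, { NULL },
};
static const struct demo_entry demo_without_sentinel[] = {
	{ "first" }, { "second" },
};
```

Here demo_register(demo_with_sentinel) forwards 3 and demo_register(demo_without_sentinel) forwards 2. In the first case the size is one too large, which is harmless as long as the traversal also stops at the sentinel.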
Joel Granados
2023-Jul-31 07:17 UTC
[Bridge] [PATCH v2 14/14] sysctl: Use ctl_table_size as stopping criteria for list macro
This is a preparation commit to make it easy to remove the sentinel elements (empty end markers) from the ctl_table arrays. It both allows the systematic removal of the sentinels and adds the ctl_table_size variable to the stopping criteria of the list_for_each_table_entry macro that traverses all ctl_table arrays. Once all the sentinels are removed by subsequent commits, ctl_table_size will become the only stopping criterion in the macro. We don't actually remove any elements in this commit, but it sets things up for the removal process to take place.

By adding header->ctl_table_size as an additional stopping criterion for the list_for_each_table_entry macro, it will execute until it finds an "empty" ->procname or until the size runs out. Therefore, if a ctl_table array with a sentinel is passed, its size will be too big (by one element) but it will stop on the sentinel. On the other hand, if a ctl_table array without a sentinel is passed, its size will be just right and there will be no need for a sentinel.

Signed-off-by: Joel Granados <j.granados at samsung.com>
Suggested-by: Jani Nikula <jani.nikula at linux.intel.com>
---
 fs/proc/proc_sysctl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 817bc51c58d8..504e847c2a3a 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -19,8 +19,9 @@
 #include <linux/kmemleak.h>
 #include "internal.h"

-#define list_for_each_table_entry(entry, header)	\
-	for ((entry) = (header->ctl_table); (entry)->procname; (entry)++)
+#define list_for_each_table_entry(entry, header)	\
+	entry = header->ctl_table;			\
+	for (size_t i = 0 ; i < header->ctl_table_size && entry->procname; ++i, entry++)

 static const struct dentry_operations proc_sys_dentry_operations;
 static const struct file_operations proc_sys_file_operations;
-- 
2.30.2
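The two-part stopping condition described in this commit message can be exercised with a small userspace sketch. It is not the kernel macro: the `demo_*` types and tables are hypothetical, and the loop is written as a plain function that counts the entries it visits, using the same `i < size && entry->procname` test as the patched list_for_each_table_entry.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct ctl_table and struct ctl_table_header. */
struct demo_entry {
	const char *procname;
};

struct demo_header {
	const struct demo_entry *ctl_table;
	size_t ctl_table_size;
};

/*
 * Count the entries a traversal would visit. The size test comes first,
 * so entry->procname is never read past the declared size.
 */
static size_t demo_count_entries(const struct demo_header *header)
{
	const struct demo_entry *entry = header->ctl_table;
	size_t visited = 0;

	for (size_t i = 0; i < header->ctl_table_size && entry->procname; ++i, entry++)
		visited++;
	return visited;
}

static const struct demo_entry demo_with_sentinel[] = {
	{ "first" }, { "second" }, { NULL },
};
static const struct demo_entry demo_without_sentinel[] = {
	{ "first" }, { "second" },
};

/* Size is one too large, but the sentinel ends the walk. */
static const struct demo_header demo_hdr_sentinel = { demo_with_sentinel, 3 };
/* Size is exact, so no sentinel is needed. */
static const struct demo_header demo_hdr_sized = { demo_without_sentinel, 2 };
/* Caller that still passes SIZE_MAX and relies on the sentinel alone. */
static const struct demo_header demo_hdr_legacy = { demo_with_sentinel, SIZE_MAX };
```

All three headers yield two visited entries from demo_count_entries(), which is the equivalence the series depends on: a table with a sentinel and a table with a correct size register the same set of entries.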
Luis Chamberlain
2023-Jul-31 20:50 UTC
[Bridge] [PATCH v2 00/14] sysctl: Add a size argument to register functions in sysctl
On Mon, Jul 31, 2023 at 09:17:14AM +0200, Joel Granados wrote:
> Why?

It would be easier to read if the what went before the why.

> This is a preparation patch set that will make it easier for us to apply
> subsequent patches that will remove the sentinel element (last empty element)
> in the ctl_table arrays.
>
> In itself, it does not remove any sentinels but it is needed to bring all the
> advantages of the removal to fruition which is to help reduce the overall build
> time size of the kernel and run time memory bloat by about ~64 bytes per
> sentinel.

s/sentinel/declared ctl array

Because you're suggesting we want to remove the sentinel, but we want to help the patch reviewer know that a sentinel is required per declared ctl array.

You can also mention here briefly that this helps ensure that future moves of sysctl arrays out from kernel/sysctl.c to their own subsystem won't incur a penalty in kernel build size or run time memory consumption.

Thanks for spinning this up again!

  Luis
Luis Chamberlain
2023-Jul-31 21:36 UTC
[Bridge] [PATCH v2 00/14] sysctl: Add a size argument to register functions in sysctl
> Joel Granados (14):
>   sysctl: Prefer ctl_table_header in proc_sysctl
>   sysctl: Use ctl_table_header in list_for_each_table_entry
>   sysctl: Add ctl_table_size to ctl_table_header
>   sysctl: Add size argument to init_header
>   sysctl: Add a size arg to __register_sysctl_table
>   sysctl: Add size to register_sysctl
>   sysctl: Add size arg to __register_sysctl_init

This is looking great, thanks. I've taken the first 7 patches above to sysctl-next to get more exposure / testing and since we're already on rc4.

Since the below patches involve more networking I'll wait to get more feedback from networking folks before merging them.

>   sysctl: Add size to register_net_sysctl function
>   ax.25: Update to register_net_sysctl_sz
>   netfilter: Update to register_net_sysctl_sz
>   networking: Update to register_net_sysctl_sz
>   vrf: Update to register_net_sysctl_sz
>   sysctl: SIZE_MAX->ARRAY_SIZE in register_net_sysctl
>   sysctl: Use ctl_table_size as stopping criteria for list macro

  Luis