Shakeel Butt
2018-Dec-29 19:34 UTC
[Bridge] [PATCH] netfilter: account ebt_table_info to kmemcg
On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko <mhocko at kernel.org> wrote:> > On Sat 29-12-18 10:52:15, Florian Westphal wrote: > > Michal Hocko <mhocko at kernel.org> wrote: > > > On Fri 28-12-18 17:55:24, Shakeel Butt wrote: > > > > The [ip,ip6,arp]_tables use x_tables_info internally and the underlying > > > > memory is already accounted to kmemcg. Do the same for ebtables. The > > > > syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the > > > > whole system from a restricted memcg, a potential DoS. > > > > > > What is the lifetime of these objects? Are they bound to any process? > > > > No, they are not. > > They are free'd only when userspace requests it or the netns is > > destroyed. > > Then this is problematic, because the oom killer is not able to > guarantee the hard limit and so the excessive memory consumption cannot > be really contained. As a result the memcg will be basically useless > until somebody tears down the charged objects by other means. The memcg > oom killer will surely kill all the existing tasks in the cgroup and > this could somehow reduce the problem. Maybe this is sufficient for > some usecases but that should be properly analyzed and described in the > changelog. >Can you explain why you think the memcg hard limit will not be enforced? From what I understand, the memcg oom-killer will kill the allocating processes as you have mentioned. We do force charging for very limited conditions but here the memcg oom-killer will take care of Anyways, the kernel is already charging the memory for [ip,ip6,arp]_tables and this patch adds the charging for ebtables. Without this patch, as Kirill has described and shown by syzbot, a low priority memcg can force system OOM. Shakeel
Michal Hocko
2018-Dec-30 07:45 UTC
[Bridge] [PATCH] netfilter: account ebt_table_info to kmemcg
On Sat 29-12-18 11:34:29, Shakeel Butt wrote:> On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko <mhocko at kernel.org> wrote: > > > > On Sat 29-12-18 10:52:15, Florian Westphal wrote: > > > Michal Hocko <mhocko at kernel.org> wrote: > > > > On Fri 28-12-18 17:55:24, Shakeel Butt wrote: > > > > > The [ip,ip6,arp]_tables use x_tables_info internally and the underlying > > > > > memory is already accounted to kmemcg. Do the same for ebtables. The > > > > > syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the > > > > > whole system from a restricted memcg, a potential DoS. > > > > > > > > What is the lifetime of these objects? Are they bound to any process? > > > > > > No, they are not. > > > They are free'd only when userspace requests it or the netns is > > > destroyed. > > > > Then this is problematic, because the oom killer is not able to > > guarantee the hard limit and so the excessive memory consumption cannot > > be really contained. As a result the memcg will be basically useless > > until somebody tears down the charged objects by other means. The memcg > > oom killer will surely kill all the existing tasks in the cgroup and > > this could somehow reduce the problem. Maybe this is sufficient for > > some usecases but that should be properly analyzed and described in the > > changelog. > > > > Can you explain why you think the memcg hard limit will not be > enforced? From what I understand, the memcg oom-killer will kill the > allocating processes as you have mentioned. We do force charging for > very limited conditions but here the memcg oom-killer will take care > ofI was talking about the force charge part. Depending on a specific allocation and its life time this can gradually get us over hard limit without any bound theoretically.> Anyways, the kernel is already charging the memory for > [ip,ip6,arp]_tables and this patch adds the charging for ebtables. > Without this patch, as Kirill has described and shown by syzbot, a low > priority memcg can force system OOM.I am not opposing the patch per-se. I would just like the changelog to be more descriptive about the life time and consequences. -- Michal Hocko SUSE Labs