Michal Hocko
2018-Dec-30 07:45 UTC
[Bridge] [PATCH] netfilter: account ebt_table_info to kmemcg
On Sat 29-12-18 11:34:29, Shakeel Butt wrote:> On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko <mhocko at kernel.org> wrote: > > > > On Sat 29-12-18 10:52:15, Florian Westphal wrote: > > > Michal Hocko <mhocko at kernel.org> wrote: > > > > On Fri 28-12-18 17:55:24, Shakeel Butt wrote: > > > > > The [ip,ip6,arp]_tables use x_tables_info internally and the underlying > > > > > memory is already accounted to kmemcg. Do the same for ebtables. The > > > > > syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the > > > > > whole system from a restricted memcg, a potential DoS. > > > > > > > > What is the lifetime of these objects? Are they bound to any process? > > > > > > No, they are not. > > > They are free'd only when userspace requests it or the netns is > > > destroyed. > > > > Then this is problematic, because the oom killer is not able to > > guarantee the hard limit and so the excessive memory consumption cannot > > be really contained. As a result the memcg will be basically useless > > until somebody tears down the charged objects by other means. The memcg > > oom killer will surely kill all the existing tasks in the cgroup and > > this could somehow reduce the problem. Maybe this is sufficient for > > some usecases but that should be properly analyzed and described in the > > changelog. > > > > Can you explain why you think the memcg hard limit will not be > enforced? From what I understand, the memcg oom-killer will kill the > allocating processes as you have mentioned. We do force charging for > very limited conditions but here the memcg oom-killer will take care > ofI was talking about the force charge part. Depending on a specific allocation and its life time this can gradually get us over hard limit without any bound theoretically.> Anyways, the kernel is already charging the memory for > [ip,ip6,arp]_tables and this patch adds the charging for ebtables. > Without this patch, as Kirill has described and shown by syzbot, a low > priority memcg can force system OOM.I am not opposing the patch per-se. I would just like the changelog to be more descriptive about the life time and consequences. -- Michal Hocko SUSE Labs
Michal Hocko
2018-Dec-30 08:00 UTC
[Bridge] [PATCH] netfilter: account ebt_table_info to kmemcg
On Sun 30-12-18 08:45:13, Michal Hocko wrote:> On Sat 29-12-18 11:34:29, Shakeel Butt wrote: > > On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko <mhocko at kernel.org> wrote: > > > > > > On Sat 29-12-18 10:52:15, Florian Westphal wrote: > > > > Michal Hocko <mhocko at kernel.org> wrote: > > > > > On Fri 28-12-18 17:55:24, Shakeel Butt wrote: > > > > > > The [ip,ip6,arp]_tables use x_tables_info internally and the underlying > > > > > > memory is already accounted to kmemcg. Do the same for ebtables. The > > > > > > syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the > > > > > > whole system from a restricted memcg, a potential DoS. > > > > > > > > > > What is the lifetime of these objects? Are they bound to any process? > > > > > > > > No, they are not. > > > > They are free'd only when userspace requests it or the netns is > > > > destroyed. > > > > > > Then this is problematic, because the oom killer is not able to > > > guarantee the hard limit and so the excessive memory consumption cannot > > > be really contained. As a result the memcg will be basically useless > > > until somebody tears down the charged objects by other means. The memcg > > > oom killer will surely kill all the existing tasks in the cgroup and > > > this could somehow reduce the problem. Maybe this is sufficient for > > > some usecases but that should be properly analyzed and described in the > > > changelog. > > > > > > > Can you explain why you think the memcg hard limit will not be > > enforced? From what I understand, the memcg oom-killer will kill the > > allocating processes as you have mentioned. We do force charging for > > very limited conditions but here the memcg oom-killer will take care > > of > > I was talking about the force charge part. Depending on a specific > allocation and its life time this can gradually get us over hard limit > without any bound theoretically.Forgot to mention. Since b8c8a338f75e ("Revert "vmalloc: back off when the current task is killed"") there is no way to bail out from the vmalloc allocation loop so if the request is really large then the memcg oom will not help. Is that a problem here? Maybe it is time to revisit fatal_signal_pending check. -- Michal Hocko SUSE Labs
Shakeel Butt
2018-Dec-31 04:00 UTC
[Bridge] [PATCH] netfilter: account ebt_table_info to kmemcg
On Sat, Dec 29, 2018 at 11:45 PM Michal Hocko <mhocko at kernel.org> wrote:> > On Sat 29-12-18 11:34:29, Shakeel Butt wrote: > > On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko <mhocko at kernel.org> wrote: > > > > > > On Sat 29-12-18 10:52:15, Florian Westphal wrote: > > > > Michal Hocko <mhocko at kernel.org> wrote: > > > > > On Fri 28-12-18 17:55:24, Shakeel Butt wrote: > > > > > > The [ip,ip6,arp]_tables use x_tables_info internally and the underlying > > > > > > memory is already accounted to kmemcg. Do the same for ebtables. The > > > > > > syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the > > > > > > whole system from a restricted memcg, a potential DoS. > > > > > > > > > > What is the lifetime of these objects? Are they bound to any process? > > > > > > > > No, they are not. > > > > They are free'd only when userspace requests it or the netns is > > > > destroyed. > > > > > > Then this is problematic, because the oom killer is not able to > > > guarantee the hard limit and so the excessive memory consumption cannot > > > be really contained. As a result the memcg will be basically useless > > > until somebody tears down the charged objects by other means. The memcg > > > oom killer will surely kill all the existing tasks in the cgroup and > > > this could somehow reduce the problem. Maybe this is sufficient for > > > some usecases but that should be properly analyzed and described in the > > > changelog. > > > > > > > Can you explain why you think the memcg hard limit will not be > > enforced? From what I understand, the memcg oom-killer will kill the > > allocating processes as you have mentioned. We do force charging for > > very limited conditions but here the memcg oom-killer will take care > > of > > I was talking about the force charge part. Depending on a specific > allocation and its life time this can gradually get us over hard limit > without any bound theoretically. > > > Anyways, the kernel is already charging the memory for > > [ip,ip6,arp]_tables and this patch adds the charging for ebtables. > > Without this patch, as Kirill has described and shown by syzbot, a low > > priority memcg can force system OOM. > > I am not opposing the patch per-se. I would just like the changelog to > be more descriptive about the life time and consequences. > --I will resend the patch with more detailed change log. thanks, Shakeel