thr3ads.net - Linux Ethernet Bridging - [Bridge] [PATCH] netfilter: account ebt_table

Shakeel Butt

2018-Dec-31 03:59 UTC

[Bridge] [PATCH] netfilter: account ebt_table_info to kmemcg

On Sun, Dec 30, 2018 at 12:00 AM Michal Hocko <mhocko at kernel.org> wrote:
>
> On Sun 30-12-18 08:45:13, Michal Hocko wrote:
> > On Sat 29-12-18 11:34:29, Shakeel Butt wrote:
> > > On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko <mhocko at kernel.org> wrote:
> > > >
> > > > On Sat 29-12-18 10:52:15, Florian Westphal wrote:
> > > > > Michal Hocko <mhocko at kernel.org> wrote:
> > > > > > On Fri 28-12-18 17:55:24, Shakeel Butt wrote:
> > > > > > > The [ip,ip6,arp]_tables use x_tables_info internally and the underlying
> > > > > > > memory is already accounted to kmemcg. Do the same for ebtables. The
> > > > > > > syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the
> > > > > > > whole system from a restricted memcg, a potential DoS.
> > > > > >
> > > > > > What is the lifetime of these objects? Are they bound to any process?
> > > > >
> > > > > No, they are not.
> > > > > They are free'd only when userspace requests it or the netns is
> > > > > destroyed.
> > > >
> > > > Then this is problematic, because the oom killer is not able to
> > > > guarantee the hard limit and so the excessive memory consumption cannot
> > > > be really contained. As a result the memcg will be basically useless
> > > > until somebody tears down the charged objects by other means. The memcg
> > > > oom killer will surely kill all the existing tasks in the cgroup and
> > > > this could somehow reduce the problem. Maybe this is sufficient for
> > > > some usecases but that should be properly analyzed and described in the
> > > > changelog.
> > > >
> > >
> > > Can you explain why you think the memcg hard limit will not be
> > > enforced? From what I understand, the memcg oom-killer will kill the
> > > allocating processes as you have mentioned. We do force charging for
> > > very limited conditions but here the memcg oom-killer will take care
> > > of
> >
> > I was talking about the force charge part. Depending on a specific
> > allocation and its life time this can gradually get us over hard limit
> > without any bound theoretically.
>
> Forgot to mention. Since b8c8a338f75e ("Revert "vmalloc: back off when
> the current task is killed"") there is no way to bail out from the
> vmalloc allocation loop so if the request is really large then the memcg
> oom will not help. Is that a problem here?
>

Yes, I think it will be an issue here.

> Maybe it is time to revisit fatal_signal_pending check.

Yes, we will need something to handle the memcg OOM. I will think more
on that front or if you have any ideas, please do propose.

thanks,
Shakeel

Michal Hocko

2018-Dec-31 10:11 UTC

[Bridge] [PATCH] netfilter: account ebt_table_info to kmemcg

On Sun 30-12-18 19:59:53, Shakeel Butt wrote:
> On Sun, Dec 30, 2018 at 12:00 AM Michal Hocko <mhocko at kernel.org> wrote:
> >
> > On Sun 30-12-18 08:45:13, Michal Hocko wrote:
> > > On Sat 29-12-18 11:34:29, Shakeel Butt wrote:
> > > > On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko <mhocko at kernel.org> wrote:
> > > > >
> > > > > On Sat 29-12-18 10:52:15, Florian Westphal wrote:
> > > > > > Michal Hocko <mhocko at kernel.org> wrote:
> > > > > > > On Fri 28-12-18 17:55:24, Shakeel Butt wrote:
> > > > > > > > The [ip,ip6,arp]_tables use x_tables_info internally and the underlying
> > > > > > > > memory is already accounted to kmemcg. Do the same for ebtables. The
> > > > > > > > syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the
> > > > > > > > whole system from a restricted memcg, a potential DoS.
> > > > > > >
> > > > > > > What is the lifetime of these objects? Are they bound to any process?
> > > > > >
> > > > > > No, they are not.
> > > > > > They are free'd only when userspace requests it or the netns is
> > > > > > destroyed.
> > > > >
> > > > > Then this is problematic, because the oom killer is not able to
> > > > > guarantee the hard limit and so the excessive memory consumption cannot
> > > > > be really contained. As a result the memcg will be basically useless
> > > > > until somebody tears down the charged objects by other means. The memcg
> > > > > oom killer will surely kill all the existing tasks in the cgroup and
> > > > > this could somehow reduce the problem. Maybe this is sufficient for
> > > > > some usecases but that should be properly analyzed and described in the
> > > > > changelog.
> > > > >
> > > >
> > > > Can you explain why you think the memcg hard limit will not be
> > > > enforced? From what I understand, the memcg oom-killer will kill the
> > > > allocating processes as you have mentioned. We do force charging for
> > > > very limited conditions but here the memcg oom-killer will take care
> > > > of
> > >
> > > I was talking about the force charge part. Depending on a specific
> > > allocation and its life time this can gradually get us over hard limit
> > > without any bound theoretically.
> >
> > Forgot to mention. Since b8c8a338f75e ("Revert "vmalloc: back off when
> > the current task is killed"") there is no way to bail out from the
> > vmalloc allocation loop so if the request is really large then the memcg
> > oom will not help. Is that a problem here?
> >
>
> Yes, I think it will be an issue here.
>
> > Maybe it is time to revisit fatal_signal_pending check.
>
> Yes, we will need something to handle the memcg OOM. I will think more
> on that front or if you have any ideas, please do propose.

I can see three options here:
	- do not force charge on memcg oom or introduce a limited charge
	  overflow (reserves basically).
	- revert the revert and reintroduce the fatal_signal_pending
	  check into vmalloc
	- be more specific and check tsk_is_oom_victim in vmalloc and
	  fail

-- 
Michal Hocko
SUSE Labs
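
[Editor's note] Options (2) and (3) above both add an early exit to the vmalloc allocation loop. The following is only a userspace toy model of that idea, not kernel code: struct toy_task, enum bail_policy and toy_vmalloc_pages() are hypothetical names invented for illustration, and the two boolean fields merely stand in for what fatal_signal_pending() and tsk_is_oom_victim() would report in the kernel. The model assumes the memcg OOM killer both marks the task as a victim and sends it a fatal signal once kill_after pages have been allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; not the kernel's task_struct. */
struct toy_task {
	bool fatal_signal;	/* models fatal_signal_pending(current) */
	bool oom_victim;	/* models tsk_is_oom_victim(current) */
};

enum bail_policy {
	BAIL_NEVER,		/* no check in the loop (state after b8c8a338f75e) */
	BAIL_ON_FATAL_SIGNAL,	/* option (2) */
	BAIL_ON_OOM_VICTIM,	/* option (3) */
};

static bool should_bail(const struct toy_task *t, enum bail_policy policy)
{
	switch (policy) {
	case BAIL_ON_FATAL_SIGNAL:
		return t->fatal_signal;
	case BAIL_ON_OOM_VICTIM:
		return t->oom_victim;
	default:
		return false;
	}
}

/*
 * Allocate up to nr_pages one page at a time. Once kill_after pages are
 * allocated, the (modelled) memcg OOM killer picks this task. Returns the
 * number of pages allocated before the loop stopped.
 */
static size_t toy_vmalloc_pages(struct toy_task *t, size_t nr_pages,
				size_t kill_after, enum bail_policy policy)
{
	size_t allocated = 0;

	while (allocated < nr_pages) {
		if (allocated == kill_after) {
			t->fatal_signal = true;
			t->oom_victim = true;
		}
		if (should_bail(t, policy))
			break;
		allocated++;	/* one more page charged to the memcg */
	}
	assert(allocated <= nr_pages);
	return allocated;
}
```

With BAIL_NEVER the loop runs to completion even after the task has been killed, which is the situation described in the thread. The two other policies differ only for a task that has a fatal signal pending without being an OOM victim: option (2) stops the loop for it, option (3) does not.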

Shakeel Butt

2019-Jan-03 20:52 UTC

[Bridge] [PATCH] netfilter: account ebt_table_info to kmemcg

On Mon, Dec 31, 2018 at 2:12 AM Michal Hocko <mhocko at kernel.org> wrote:
>
> On Sun 30-12-18 19:59:53, Shakeel Butt wrote:
> > On Sun, Dec 30, 2018 at 12:00 AM Michal Hocko <mhocko at kernel.org> wrote:
> > >
> > > On Sun 30-12-18 08:45:13, Michal Hocko wrote:
> > > > On Sat 29-12-18 11:34:29, Shakeel Butt wrote:
> > > > > On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko <mhocko at kernel.org> wrote:
> > > > > >
> > > > > > On Sat 29-12-18 10:52:15, Florian Westphal wrote:
> > > > > > > Michal Hocko <mhocko at kernel.org> wrote:
> > > > > > > > On Fri 28-12-18 17:55:24, Shakeel Butt wrote:
> > > > > > > > > The [ip,ip6,arp]_tables use x_tables_info internally and the underlying
> > > > > > > > > memory is already accounted to kmemcg. Do the same for ebtables. The
> > > > > > > > > syzbot, by using setsockopt(EBT_SO_SET_ENTRIES), was able to OOM the
> > > > > > > > > whole system from a restricted memcg, a potential DoS.
> > > > > > > >
> > > > > > > > What is the lifetime of these objects? Are they bound to any process?
> > > > > > >
> > > > > > > No, they are not.
> > > > > > > They are free'd only when userspace requests it or the netns is
> > > > > > > destroyed.
> > > > > >
> > > > > > Then this is problematic, because the oom killer is not able to
> > > > > > guarantee the hard limit and so the excessive memory consumption cannot
> > > > > > be really contained. As a result the memcg will be basically useless
> > > > > > until somebody tears down the charged objects by other means. The memcg
> > > > > > oom killer will surely kill all the existing tasks in the cgroup and
> > > > > > this could somehow reduce the problem. Maybe this is sufficient for
> > > > > > some usecases but that should be properly analyzed and described in the
> > > > > > changelog.
> > > > > >
> > > > >
> > > > > Can you explain why you think the memcg hard limit will not be
> > > > > enforced? From what I understand, the memcg oom-killer will kill the
> > > > > allocating processes as you have mentioned. We do force charging for
> > > > > very limited conditions but here the memcg oom-killer will take care
> > > > > of
> > > >
> > > > I was talking about the force charge part. Depending on a specific
> > > > allocation and its life time this can gradually get us over hard limit
> > > > without any bound theoretically.
> > >
> > > Forgot to mention. Since b8c8a338f75e ("Revert "vmalloc: back off when
> > > the current task is killed"") there is no way to bail out from the
> > > vmalloc allocation loop so if the request is really large then the memcg
> > > oom will not help. Is that a problem here?
> > >
> >
> > Yes, I think it will be an issue here.
> >
> > > Maybe it is time to revisit fatal_signal_pending check.
> >
> > Yes, we will need something to handle the memcg OOM. I will think more
> > on that front or if you have any ideas, please do propose.
>
> I can see three options here:
>         - do not force charge on memcg oom or introduce a limited charge
>           overflow (reserves basically).
>         - revert the revert and reintroduce the fatal_signal_pending
>           check into vmalloc
>         - be more specific and check tsk_is_oom_victim in vmalloc and
>           fail
>

I think for the long term solution we might need something similar to
memcg oom reserves (1) but for quick fix I think we can do the
combination of (2) and (3).

Shakeel
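
[Editor's note] The "memcg oom reserves" idea in option (1) can be pictured as a cap on how far a force charge may push usage past the hard limit. The sketch below is a userspace toy model under that assumption only. It does not reflect how memcg charging is implemented in the kernel, and struct toy_memcg, toy_try_charge() and toy_charge_loop() are hypothetical names. The unbounded_force flag models the behaviour described in the thread, where force charging can exceed the hard limit without any theoretical bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a memory cgroup; all sizes are in pages. */
struct toy_memcg {
	size_t limit;		/* hard limit */
	size_t reserve;		/* overshoot allowed for forced charges */
	bool unbounded_force;	/* true: forced charges always succeed */
	size_t usage;		/* pages currently charged */
};

/* Charge nr pages; force marks a charge that may exceed the limit. */
static bool toy_try_charge(struct toy_memcg *cg, size_t nr, bool force)
{
	size_t ceiling = cg->limit;

	if (force) {
		if (cg->unbounded_force) {
			cg->usage += nr;	/* no bound at all */
			return true;
		}
		ceiling += cg->reserve;		/* bounded overflow */
	}
	/* Check usage first so the subtraction cannot wrap around. */
	if (cg->usage > ceiling || nr > ceiling - cg->usage)
		return false;
	cg->usage += nr;
	return true;
}

/* Charge page by page until done or refused; returns the final usage. */
static size_t toy_charge_loop(struct toy_memcg *cg, size_t nr_pages, bool force)
{
	for (size_t i = 0; i < nr_pages; i++) {
		if (!toy_try_charge(cg, 1, force))
			break;
	}
	assert(cg->unbounded_force || cg->usage <= cg->limit + cg->reserve);
	return cg->usage;
}
```

In this model a 1000-page forced request against a 100-page limit ends at 1000 pages when force charging is unbounded, and at limit + reserve pages when a reserve is configured, so the overshoot stays bounded no matter how large the request is.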
