This patch changes HTB's class storage from hash+lists to a two-level linear array, so that it can do constant-time (O(1)) class lookup by classid. It improves scalability for large numbers of classes.

Without the patch, ~14k HTB classes can starve a Xeon-3.2 at only 15kpps, spending most of its cycles traversing lists in htb_find(). The patch eliminates this problem, and has a measurable impact even with a few hundred classes.

Previously, scalability could be improved by increasing HTB_HSIZE, modifying the hash function, and recompiling, but this patch works for everyone without a recompile and scales better too.

The patch is for 2.6.20-rc6; I have older ones for 2.6.18 and 2.6.19 if anyone is interested.

Signed-off-by: Simon Lodal <simonl@parknet.dk>
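In sketch form, the lookup boils down to the following (simplified; naming of the second-level index helpers is approximate and details differ in the actual patch). The 16-bit minor id is split into an index into a first-level array of page-sized tables plus an index into that table, so a lookup is two dereferences and then the htb_class itself:

#define HTB_MAX_CLS		(TC_H_MIN(-1) + 1)	/* 64k possible minor ids */
#define HTB_CLS_PER_ARRAY	(PAGE_SIZE / sizeof(void *))
#define HTB_CLS_ARRAY(minor)	((minor) / HTB_CLS_PER_ARRAY)	/* which table */
#define HTB_CLS_INDEX(minor)	((minor) % HTB_CLS_PER_ARRAY)	/* slot in table */

typedef struct {
	struct htb_class *ptr[HTB_CLS_PER_ARRAY];
} htb_cls_array;

/* in struct htb_sched: second-level tables, allocated on demand */
htb_cls_array *classes[HTB_MAX_CLS / HTB_CLS_PER_ARRAY];

static inline struct htb_class *htb_lookup_cls(struct htb_sched *q, u32 handle)
{
	u32 minor = TC_H_MIN(handle);
	htb_cls_array *a = q->classes[HTB_CLS_ARRAY(minor)];	/* 1st access */

	if (a == NULL)
		return NULL;			/* no classes in this range yet */
	return a->ptr[HTB_CLS_INDEX(minor)];			/* 2nd access */
}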
Simon Lodal wrote:

> This patch changes HTB's class storage from hash+lists to a two-level
> linear array, so that it can do constant-time (O(1)) class lookup by
> classid. It improves scalability for large numbers of classes.
>
> Without the patch, ~14k HTB classes can starve a Xeon-3.2 at only 15kpps,
> spending most of its cycles traversing lists in htb_find(). The patch
> eliminates this problem, and has a measurable impact even with a few
> hundred classes.
>
> Previously, scalability could be improved by increasing HTB_HSIZE,
> modifying the hash function, and recompiling, but this patch works for
> everyone without a recompile and scales better too.

I agree that the current fixed-size hashes (additionally quite small by default) are a big problem with many classes, for all of HTB/HFSC/CBQ. But I think your approach is a bit wasteful: with unfortunately chosen classids, 128 classes are enough to reach the maximum memory usage of ~512kb (with 4k pages and 8-byte pointers).

I have a patch for HFSC which introduces dynamic resizing of the class hash. I have planned to generalize it (similar to tcf_hashinfo) and convert HTB and CBQ as well, which as a nice side effect will allow getting rid of some duplicated code, like hash walking.

If you give me a few days I'll try to finish and post it.
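(For the record, the ~512kb figure assumes one page-sized second-level table is allocated per populated first-level slot:

    pointers per table  = 4096 / 8     = 512   (8-byte pointers)
    first-level slots   = 65536 / 512  = 128
    worst case          = 128 * 4096   = 512 KiB

i.e. classids whose minors are spaced 512 apart each land in a different table, so 128 such classes already allocate every table. With 4-byte pointers the same reasoning gives 1024 entries per table, 64 tables, 256 KiB.)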
On Thursday 01 February 2007 07:08, Patrick McHardy wrote:

> Simon Lodal wrote:
>> This patch changes HTB's class storage from hash+lists to a two-level
>> linear array, so that it can do constant-time (O(1)) class lookup by
>> classid. It improves scalability for large numbers of classes.
>>
>> Without the patch, ~14k HTB classes can starve a Xeon-3.2 at only 15kpps,
>> spending most of its cycles traversing lists in htb_find(). The patch
>> eliminates this problem, and has a measurable impact even with a few
>> hundred classes.
>>
>> Previously, scalability could be improved by increasing HTB_HSIZE,
>> modifying the hash function, and recompiling, but this patch works for
>> everyone without a recompile and scales better too.
>
> I agree that the current fixed-size hashes (additionally quite small by
> default) are a big problem with many classes, for all of HTB/HFSC/CBQ.
> But I think your approach is a bit wasteful: with unfortunately chosen
> classids, 128 classes are enough to reach the maximum memory usage of
> ~512kb (with 4k pages and 8-byte pointers).

I think it is a non-issue, since it does not happen in practice. Generally there are two ways to assign classids:

1) Manually, used when you have few classes. People usually use 100, 101, 102, 200, 201 etc (probably unaware that they are hex). With 4k pages and 32-bit pointers, everything below classid 400 is within the first page, which covers most "few classes" examples you can find lying around.

2) Script-generated, which in practice is required if you have many classes. The classids will then usually be consecutive, at least in large chunks, which means minimal memory waste, and an optimal case for plain linear lookup; hashing them can only be wasteful.

> I have a patch for HFSC which introduces dynamic resizing of the
> class hash. I have planned to generalize it (similar to tcf_hashinfo)
> and convert HTB and CBQ as well, which as a nice side effect will
> allow getting rid of some duplicated code, like hash walking.

I have not been looking into HFSC and CBQ; I was not aware that they have similar issues.

> If you give me a few days I'll try to finish and post it.

Memory is generally not an issue, but CPU is, and you can not beat the CPU efficiency of plain array lookup (always faster, and constant time).

If anything, I would find it more relevant to use array lookup with dynamic adjustment of the array size (HTB_CLS_ARRAY_SIZE in my patch): start out small to waste less memory, and grow up to PAGE_SIZE as needed. But then, it is probably too much effort for the gain (a few kB in machines that should have plenty of RAM anyway), and requires more code => more complexity, bugs, maintenance.

Regards
Simon
Simon Lodal <simonl@parknet.dk> writes:

> Memory is generally not an issue, but CPU is, and you can not beat the CPU
> efficiency of plain array lookup (always faster, and constant time).

Actually that's not true when the array doesn't fit in cache.

The cost of going out to memory over the caches is so large (a factor of 100 and more) that an algorithm with a smaller cache footprint can easily beat an algorithm that executes far fewer instructions, if it takes fewer cache misses. Not all instructions have the same cost: anything in core is very fast, but going out to memory is very slow.

That said, I think I agree with your analysis that a two-level array is probably the right data structure for this, and likely not less cache efficient than the hash table. And the worst memory consumption case considered by Patrick should be relatively unlikely.

-Andi
On Thu, 2007-01-02 at 07:08 +0100, Patrick McHardy wrote:

> I have a patch for HFSC which introduces dynamic resizing of the
> class hash.

One thing that has bitten me recently was a test to see how far I could go inserting xfrm SAD/SPD entries - the resizing of the hashes kept allocating more and more space until I ran out of memory, then swap took over and all hell broke loose. It would be nice in your approach to keep a configurable upper bound on how much memory a hash table can chew.

> I have planned to generalize it (similar to tcf_hashinfo)
> and convert HTB and CBQ as well, which as a nice side effect will
> allow getting rid of some duplicated code, like hash walking.

You know what would be really nice: a generic piece of code that would apply to all sorts of net code that uses hashes (there is a huge amount of such code), and then slowly converting all users over to it. All the attributes of such hashes are known: max size, hash(), etc. The tcf_hashinfo is a good starting template for such an effort.

cheers,
jamal
Simon Lodal wrote:

> The patch is for 2.6.20-rc6; I have older ones for 2.6.18 and 2.6.19 if
> anyone is interested.

It is also working on 2.6.20-rc7. I'm testing it and I'm impressed. Good work :)
On 01-02-2007 12:30, Andi Kleen wrote:

> Simon Lodal <simonl@parknet.dk> writes:
>> Memory is generally not an issue, but CPU is, and you can not beat the CPU
>> efficiency of plain array lookup (always faster, and constant time).

Probably for some old (or embedded) lean boxes used as small network routers, with memory-hungry iptables, memory could be an issue.

> Actually that's not true when the array doesn't fit in cache.
>
> The cost of going out to memory over the caches is so large (a factor of
> 100 and more) that an algorithm with a smaller cache footprint can easily
> beat an algorithm that executes far fewer instructions, if it takes fewer
> cache misses. Not all instructions have the same cost: anything in core is
> very fast, but going out to memory is very slow.
>
> That said, I think I agree with your analysis that a two-level array is
> probably the right data structure for this, and likely not less cache
> efficient than the hash table.

Strange - it seems you gave only arguments against this analysis...

> And the worst memory consumption case considered by Patrick should
> be relatively unlikely.

Anyway, such an approach - assuming that most users will do things this (reasonable) way - doesn't look like good programming practice.

I wonder, why not try, at least for a while, making this a compile-time (menuconfig) option with a comment: recommended for a large number of classes. After hash optimization and some testing, final decisions could be made.

Regards,
Jarek P.
On Monday 05 February 2007 11:16, Jarek Poplawski wrote:

> Strange - it seems you gave only arguments against this analysis...

For a naturally clustered key space (as is common in this case), the two-level structure is likely more cache efficient than a generic hash function. That is because the hash will tend to spread out the natural clusters, which then require more cache lines to access because there is less sharing. In theory a hash function carefully tuned for this case might have similar properties, but normally people don't put that much care into designing hashes and just use some generic one.

>> And the worst memory consumption case considered by Patrick should
>> be relatively unlikely.
>
> Anyway, such an approach - assuming that most users will do things this
> (reasonable) way - doesn't look like good programming practice.

In the unlikely worst case they will get half a MB of tables. Hardly a showstopper.

> I wonder, why not try, at least for a while, making this a compile-time
> (menuconfig) option with a comment: recommended for a large number of
> classes. After hash optimization and some testing, final decisions could
> be made.

There are already far too many obscure CONFIGs. Don't add more.

-Andi
Andi Kleen wrote:

> On Monday 05 February 2007 11:16, Jarek Poplawski wrote:
>> I wonder, why not try, at least for a while, making this a compile-time
>> (menuconfig) option with a comment: recommended for a large number of
>> classes. After hash optimization and some testing, final decisions could
>> be made.
>
> There are already far too many obscure CONFIGs. Don't add more.

And for people concerned about memory usage: there is always CONFIG_SMALL for such decisions. Maybe one could limit the worst-case and average memory usage based on that config. The algorithm should stay the same.

Regards

Ingo Oeser
On Monday 05 February 2007 11:16, Jarek Poplawski wrote:

> On 01-02-2007 12:30, Andi Kleen wrote:
>> Simon Lodal <simonl@parknet.dk> writes:
>>> Memory is generally not an issue, but CPU is, and you can not beat the
>>> CPU efficiency of plain array lookup (always faster, and constant time).
>
> Probably for some old (or embedded) lean boxes used as small network
> routers, with memory-hungry iptables, memory could be an issue.

Sure, but if they are that constrained they probably do not run HTB in the first place.

We are talking about 4k initially, up to 256k worst case (or 512k if your router is 64-bit, unlikely if "small" is a priority).

>> And the worst memory consumption case considered by Patrick should
>> be relatively unlikely.
>
> Anyway, such an approach - assuming that most users will do things this
> (reasonable) way - doesn't look like good programming practice.

The current hash algorithm also assumes certain usage patterns, namely that you choose classids that generate different hash keys (= distribute uniformly across the buckets), or scalability will suffer very quickly. Even at 64 classes you would probably see htb_find() near the top of a profiling analysis.

But I would say that it is just as unlikely as choosing 64 classids that cause my patch to allocate all 256k.

In these unlikely cases, my patch only wastes passive memory, while the current HTB wastes CPU to a point where it can severely limit routing performance.

> I wonder, why not try, at least for a while, making this a compile-time
> (menuconfig) option with a comment: recommended for a large number of
> classes. After hash optimization and some testing, final decisions could
> be made.

I decided not to do it because it would mean too many ifdefs (ugly + unmaintainable code).

Regards
Simon
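P.S. For reference, the lookup being replaced is a small fixed hash plus a list walk, roughly like this (sketch from memory, not the exact source):

#define HTB_HSIZE 16				/* fixed number of buckets */

static inline int htb_hash(u32 h)
{
	h ^= h >> 8;
	h ^= h >> 4;
	return h & (HTB_HSIZE - 1);
}

static struct htb_class *htb_find(u32 handle, struct htb_sched *q)
{
	struct htb_class *cl;

	/* with N classes this list averages N/16 entries, so the walk is
	 * O(N): at 64 classes that is already ~4 compares per lookup, and
	 * at ~14k classes it dominates the profile */
	list_for_each_entry(cl, q->hash + htb_hash(handle), hlist)
		if (cl->classid == handle)
			return cl;
	return NULL;
}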
On Thursday 01 February 2007 12:30, Andi Kleen wrote:

> Simon Lodal <simonl@parknet.dk> writes:
>> Memory is generally not an issue, but CPU is, and you can not beat the
>> CPU efficiency of plain array lookup (always faster, and constant time).
>
> Actually that's not true when the array doesn't fit in cache.
>
> The cost of going out to memory over the caches is so large (a factor of
> 100 and more) that an algorithm with a smaller cache footprint can easily
> beat an algorithm that executes far fewer instructions, if it takes fewer
> cache misses. Not all instructions have the same cost: anything in core is
> very fast, but going out to memory is very slow.
>
> That said, I think I agree with your analysis that a two-level array is
> probably the right data structure for this, and likely not less cache
> efficient than the hash table.

Good point.

The two-level lookup generates 3 memory accesses (including getting at the htb_class struct). List traversal will generate many more memory accesses, unless the lists have 3 or fewer entries (currently that only holds true for up to 48 classes, uniformly distributed).

It is difficult to judge whether the tables will be in cache or not. The tables are clearly extra baggage for the cache lines, compared to only having the htb_class structs (which are going to be fetched anyway). On the other hand, if you have 10k classes, they are usually (in the real world) allocated for individual users, of which at most half are active at any time. With hashing, all 10k classes are fetched into cache lines all the time, only in order to traverse lists. That is >150k of wasted cache (5000 x 32 bytes); plenty for 10k pointers in lookup tables.

Regards
Simon
On Mon, Feb 05, 2007 at 06:14:13PM +0100, Simon Lodal wrote:

> On Monday 05 February 2007 11:16, Jarek Poplawski wrote:
>> On 01-02-2007 12:30, Andi Kleen wrote:
>>> Simon Lodal <simonl@parknet.dk> writes:
>>>> Memory is generally not an issue, but CPU is, and you can not beat the
>>>> CPU efficiency of plain array lookup (always faster, and constant time).
>>
>> Probably for some old (or embedded) lean boxes used as small network
>> routers, with memory-hungry iptables, memory could be an issue.
>
> Sure, but if they are that constrained they probably do not run HTB in the
> first place.
>
> We are talking about 4k initially, up to 256k worst case (or 512k if your
> router is 64-bit, unlikely if "small" is a priority).
>
>>> And the worst memory consumption case considered by Patrick should
>>> be relatively unlikely.
>>
>> Anyway, such an approach - assuming that most users will do things this
>> (reasonable) way - doesn't look like good programming practice.
>
> The current hash algorithm also assumes certain usage patterns, namely
> that you choose classids that generate different hash keys (= distribute
> uniformly across the buckets), or scalability will suffer very quickly.
> Even at 64 classes you would probably see htb_find() near the top of a
> profiling analysis.
>
> But I would say that it is just as unlikely as choosing 64 classids that
> cause my patch to allocate all 256k.
>
> In these unlikely cases, my patch only wastes passive memory, while the
> current HTB wastes CPU to a point where it can severely limit routing
> performance.
>
>> I wonder, why not try, at least for a while, making this a compile-time
>> (menuconfig) option with a comment: recommended for a large number of
>> classes. After hash optimization and some testing, final decisions could
>> be made.
>
> I decided not to do it because it would mean too many ifdefs
> (ugly + unmaintainable code).

As a matter of fact, Andi's recommendation is enough for me. In his first message he wrote "probably the right data structure for this", so I thought: why not test and make sure. It should be easier without removing the current solution. But his second message convinced me.

Generally I think 512k (or even 256k) should matter, and I don't agree that HTB is not for constrained boxes. It would be a dangerous attitude if every module in the kernel were so "generous". And it could be contagious: others don't care, so why should I? Some time ago, low memory requirements and the possibility of running on older boxes were strong arguments for Linux. Did we give that up to the BSDs?

So I only wanted to make sure there would be a real gain, because, for consistency, probably the same model should be used for the others (CBQ, HFSC).

Cheers,
Jarek P.
On Mon, Feb 05, 2007 at 06:14:13PM +0100, Simon Lodal wrote:
...
> Regards
...

It seems the decision makers need more time, so I'd add 2 cents more:

1c: the indentation could be improved (spaces around operators), like in these places:

> +#define HTB_MAX_CLS (TC_H_MIN(-1)+1)
...
> +	htb_cls_array* a;
...
> +	int cnt,done;

etc.

2c: it is a question of taste, but here:

>  	err = -ENOBUFS;
> +	if (q->classes[HTB_CLS_ARRAY(minorid)] == NULL) {
> +		if ((q->classes[HTB_CLS_ARRAY(minorid)] =
> +		     kzalloc(sizeof(htb_cls_array), GFP_KERNEL)) == NULL)
> +			goto failure;
> +	}
>  	if ((cl = kzalloc(sizeof(*cl), GFP_KERNEL)) == NULL)
>  		goto failure;

it would probably be more readable, and a bit merciful to the stressed system, to free this htb_cls_array after the last error (I know it's not a leak).

Regards,
Jarek P.

PS: 1c extra - it's easier to read a diff if you use the -p option.
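PS2: something along these lines is what I mean (untested sketch of the idea only, using a hypothetical "int new_array = 0;" local declared earlier, not a patch):

	err = -ENOBUFS;
	if (q->classes[HTB_CLS_ARRAY(minorid)] == NULL) {
		q->classes[HTB_CLS_ARRAY(minorid)] =
			kzalloc(sizeof(htb_cls_array), GFP_KERNEL);
		if (q->classes[HTB_CLS_ARRAY(minorid)] == NULL)
			goto failure;
		new_array = 1;		/* remember we just allocated the table */
	}
	cl = kzalloc(sizeof(*cl), GFP_KERNEL);
	if (cl == NULL) {
		/* don't keep an empty second-level table around */
		if (new_array) {
			kfree(q->classes[HTB_CLS_ARRAY(minorid)]);
			q->classes[HTB_CLS_ARRAY(minorid)] = NULL;
		}
		goto failure;
	}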