Steven Rostedt

2019-May-23 14:00 UTC

[Nouveau] [RFC][PATCH] kernel.h: Add generic roundup_64() macro

From: Steven Rostedt (VMware) <rostedt at goodmis.org>

In discussing a build failure on x86_32 due to the use of roundup() on
a 64 bit number, I realized that there's no generic equivalent
roundup_64(). It is implemented in two separate places in the kernel,
but there really should be just one that all can use.

Although the other implementations are static inline functions, this
implementation is a macro to allow the use of typeof(x) to denote the
type that is being used. If the build is on a 64 bit machine, then the
roundup_64() macro will just default back to roundup(). But for 32 bit
machines, it will use a version that will not cause issues with
dividing a 64 bit number on a 32 bit machine.

Link: http://lkml.kernel.org/r/20190522145450.25ff483d at gandalf.local.home

Signed-off-by: Steven Rostedt (VMware) <rostedt at goodmis.org>
---
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 34a998012bf6..cdacfe1f732c 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -143,14 +143,6 @@ nouveau_bo_del_ttm(struct ttm_buffer_object *bo)
 	kfree(nvbo);
 }
 
-static inline u64
-roundup_64(u64 x, u32 y)
-{
-	x += y - 1;
-	do_div(x, y);
-	return x * y;
-}
-
 static void
 nouveau_bo_fixup_align(struct nouveau_bo *nvbo, u32 flags,
 		       int *align, u64 *size)
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index edbd5a210df2..13de9d49bd52 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -207,13 +207,6 @@ static inline xfs_dev_t linux_to_xfs_dev_t(dev_t dev)
 #define xfs_sort(a,n,s,fn)	sort(a,n,s,fn,NULL)
 #define xfs_stack_trace()	dump_stack()
 
-static inline uint64_t roundup_64(uint64_t x, uint32_t y)
-{
-	x += y - 1;
-	do_div(x, y);
-	return x * y;
-}
-
 static inline uint64_t howmany_64(uint64_t x, uint32_t y)
 {
 	x += y - 1;
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 74b1ee9027f5..cd0063629357 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -115,6 +115,20 @@
 	(((x) + (__y - 1)) / __y) * __y;		\
 }							\
 )
+
+#if BITS_PER_LONG == 32
+# define roundup_64(x, y) (				\
+{							\
+	typeof(y) __y = y;				\
+	typeof(x) __x = (x) + (__y - 1);		\
+	do_div(__x, __y);				\
+	__x * __y;					\
+}							\
+)
+#else
+# define roundup_64(x, y)	roundup(x, y)
+#endif
+
 /**
  * rounddown - round down to next specified multiple
  * @x: the value to round
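
For readers following along outside the kernel tree, the arithmetic in the proposed 32-bit branch can be sketched in userspace C. This is only an illustration: `div_u64_inplace()` is a hypothetical stand-in for the kernel's `do_div()` (which is an architecture-specific macro), and `roundup_64_sketch()` is an assumed name, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-in for the kernel's do_div(): divides *n
 * in place by a 32-bit divisor and returns the remainder.
 */
static uint32_t div_u64_inplace(uint64_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}

/* Round x up to the next multiple of y; y need not be a power of two. */
static uint64_t roundup_64_sketch(uint64_t x, uint32_t y)
{
	assert(y != 0);
	x += y - 1;             /* bias so the truncating division rounds up */
	div_u64_inplace(&x, y); /* x now holds the quotient */
	return x * y;           /* scale back to a multiple of y */
}
```

The three steps mirror the removed nouveau and xfs helpers: add `y - 1`, divide in place, multiply back.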

Linus Torvalds

2019-May-23 15:10 UTC

On Thu, May 23, 2019 at 7:00 AM Steven Rostedt <rostedt at goodmis.org> wrote:
>
> +# define roundup_64(x, y) (                            \
> +{                                                      \
> +       typeof(y) __y = y;                              \
> +       typeof(x) __x = (x) + (__y - 1);                \
> +       do_div(__x, __y);                               \
> +       __x * __y;                                      \
> +}                                                      \

The thing about this is that it absolutely sucks for power-of-two arguments.

The regular roundup() that uses division has the compiler at least
optimize them to shifts - at least for constant cases. But do_div() is
meant for "we already know it's not a power of two", and the compiler
doesn't have any understanding of the internals.

And it looks to me like the use case you want this for is very much
probably a power of two. In which case division is all kinds of just
stupid.

And we already have a power-of-two round up function that works on
u64. It's called "round_up()".

I wish we had a better visual warning about the differences between
"round_up()" (limited to powers-of-two, but efficient, and works with
any size) and "roundup()" (generic, potentially horribly slow, and
doesn't work for 64-bit on 32-bit).

Side note: "round_up()" has the problem that it uses "x" twice.
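
To make the distinction concrete, here is a minimal userspace sketch of the two rounding styles. The function names are assumptions for illustration and these are not the kernel's macros; they are written as functions, so each argument is evaluated exactly once.

```c
#include <assert.h>
#include <stdint.h>

/* True when y has exactly one bit set. */
static int is_pow2_u64(uint64_t y)
{
	return y != 0 && (y & (y - 1)) == 0;
}

/* Mask-based rounding: no division, but only valid for power-of-two y. */
static uint64_t round_up_pow2(uint64_t x, uint64_t y)
{
	assert(is_pow2_u64(y));
	return (x + (y - 1)) & ~(y - 1);
}

/* Division-based rounding: works for any non-zero y. */
static uint64_t roundup_div(uint64_t x, uint64_t y)
{
	assert(y != 0);
	return ((x + (y - 1)) / y) * y;
}
```

For a power-of-two `y` both functions return the same value; only the division-based one is meaningful for something like `y == 3`.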

End result: somebody should look at this, but I really don't like the
"force division" case that is likely horribly slow and nasty.

                  Linus

Steven Rostedt

2019-May-23 15:27 UTC

On Thu, 23 May 2019 08:10:44 -0700
Linus Torvalds <torvalds at linux-foundation.org> wrote:
> On Thu, May 23, 2019 at 7:00 AM Steven Rostedt <rostedt at goodmis.org> wrote:
> >
> > +# define roundup_64(x, y) (                            \
> > +{                                                      \
> > +       typeof(y) __y = y;                              \
> > +       typeof(x) __x = (x) + (__y - 1);                \
> > +       do_div(__x, __y);                               \
> > +       __x * __y;                                      \
> > +}                                                      \
> 
> The thing about this is that it absolutely sucks for power-of-two
> arguments.
> 
> The regular roundup() that uses division has the compiler at least
> optimize them to shifts - at least for constant cases. But do_div() is
> meant for "we already know it's not a power of two", and the compiler
> doesn't have any understanding of the internals.
> 
> And it looks to me like the use case you want this for is very much
> probably a power of two. In which case division is all kinds of just
> stupid.
> 
> And we already have a power-of-two round up function that works on
> u64. It's called "round_up()".
> 
> I wish we had a better visual warning about the differences between
> "round_up()" (limited to powers-of-two, but efficient, and works with
> any size) and "roundup()" (generic, potentially horribly slow, and
> doesn't work for 64-bit on 32-bit).
> 
> Side note: "round_up()" has the problem that it uses "x" twice.
> 
> End result: somebody should look at this, but I really don't like the
> "force division" case that is likely horribly slow and nasty.

I haven't yet tested this, but what about something like the following:

# define roundup_64(x, y) (				\
{							\
	typeof(y) __y;					\
	typeof(x) __x;					\
							\
	if (__builtin_constant_p(y) &&			\
	    !((y) & ((y) - 1))) {			\
		__x = round_up(x, y);			\
	} else {					\
		__y = y;				\
		__x = (x) + (__y - 1);			\
		do_div(__x, __y);			\
		__x = __x * __y;			\
	}						\
	__x;						\
}							\
)

If the compiler knows enough that y is a power of two, it will use the
shift version. Otherwise, it doesn't know enough and would divide
regardless. Or perhaps forget about the constant check, and just force
the power of two check:

# define roundup_64(x, y) (				\
{							\
	typeof(y) __y = y;				\
	typeof(x) __x;					\
							\
	if (!(__y & (__y - 1))) {			\
		__x = round_up(x, __y);			\
	} else {					\
		__x = (x) + (__y - 1);			\
		do_div(__x, __y);			\
		__x = __x * __y;			\
	}						\
	__x;						\
}							\
)

This way even if the compiler doesn't know that this is a power of two,
it will still do the shift if y ends up being one.
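
The runtime-check idea can be tried outside the kernel with a small sketch like the one below. It is an assumption-laden illustration: `roundup_64_checked()` is a made-up name, and plain 64-bit division stands in for `do_div()`. The power-of-two test used is `y & (y - 1)`, which clears the lowest set bit; a shifted form such as `y & (y >> 1)` only detects adjacent set bits and would also accept 5 (binary 101).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round x up to a multiple of y. Uses a mask when y is a power of two
 * and falls back to division otherwise.
 */
static uint64_t roundup_64_checked(uint64_t x, uint32_t y)
{
	assert(y != 0);

	if ((y & (y - 1)) == 0) {
		/* Power of two: mask off the low bits, no division needed. */
		uint64_t mask = (uint64_t)y - 1;

		return (x + mask) & ~mask;
	}

	/* General case: userspace division stands in for do_div(). */
	x += y - 1;
	x /= y;
	return x * y;
}
```

Whether the extra branch pays for itself depends on how often callers pass a non-constant `y`, which is the trade-off being weighed above.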

-- Steve
