Ilia Mirkin
2014-Feb-09 20:51 UTC
[Nouveau] [PATCH 1/2] drm/nouveau: replace ffsll with __ffs64
The ffsll function is a lot slower than the __ffs64 built-in which
compiles to a single instruction on 64-bit. It's also nice to avoid
custom versions of standard functions. Note that __ffs == ffs - 1.
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
I wrote a user-space program to test these out and make sure that the
functions behaved as expected. The logic in abi16 had to be flipped around a
bit since __ffs doesn't distinguish between 0 and 1. There's a minor
difference in that init->channel is going to get returned as 0 for ENOSPC vs
-1, but I can't imagine that'd matter.
drivers/gpu/drm/nouveau/core/core/parent.c | 2 +-
drivers/gpu/drm/nouveau/core/os.h | 11 -----------
drivers/gpu/drm/nouveau/nouveau_abi16.c | 4 ++--
3 files changed, 3 insertions(+), 14 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/core/core/parent.c
b/drivers/gpu/drm/nouveau/core/core/parent.c
index 313380c..dee5d12 100644
--- a/drivers/gpu/drm/nouveau/core/core/parent.c
+++ b/drivers/gpu/drm/nouveau/core/core/parent.c
@@ -49,7 +49,7 @@ nouveau_parent_sclass(struct nouveau_object *parent, u16
handle,
mask = nv_parent(parent)->engine;
while (mask) {
- int i = ffsll(mask) - 1;
+ int i = __ffs64(mask);
if (nv_iclass(parent, NV_CLIENT_CLASS))
engine = nv_engine(nv_client(parent)->device);
diff --git a/drivers/gpu/drm/nouveau/core/os.h
b/drivers/gpu/drm/nouveau/core/os.h
index 191e739..3cd6120 100644
--- a/drivers/gpu/drm/nouveau/core/os.h
+++ b/drivers/gpu/drm/nouveau/core/os.h
@@ -23,17 +23,6 @@
#include <asm/unaligned.h>
-static inline int
-ffsll(u64 mask)
-{
- int i;
- for (i = 0; i < 64; i++) {
- if (mask & (1ULL << i))
- return i + 1;
- }
- return 0;
-}
-
#ifndef ioread32_native
#ifdef __BIG_ENDIAN
#define ioread16_native ioread16be
diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c
b/drivers/gpu/drm/nouveau/nouveau_abi16.c
index 900fae0..b701117 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.c
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c
@@ -270,8 +270,8 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS)
return nouveau_abi16_put(abi16, -EINVAL);
/* allocate "abi16 channel" data and make up a handle for it */
- init->channel = ffsll(~abi16->handles);
- if (!init->channel--)
+ init->channel = __ffs64(~abi16->handles);
+ if (~abi16->handles == 0)
return nouveau_abi16_put(abi16, -ENOSPC);
chan = kzalloc(sizeof(*chan), GFP_KERNEL);
--
1.8.3.2
Ilia Mirkin
2014-Feb-09 20:51 UTC
[Nouveau] [PATCH 2/2] drm/nouveau/abi16: fix handles past the 32nd one
abi16->handles is a u64, so make sure to use 1ULL << val when
modifying.
Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>
---
Noticed this when doing the previous patch. I'm not sure whether this
affects
64-bit builds or not, didn't care to look at the assembly or check the
standard.
drivers/gpu/drm/nouveau/nouveau_abi16.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c
b/drivers/gpu/drm/nouveau/nouveau_abi16.c
index b701117..66abf4d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_abi16.c
+++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c
@@ -139,7 +139,7 @@ nouveau_abi16_chan_fini(struct nouveau_abi16 *abi16,
/* destroy channel object, all children will be killed too */
if (chan->chan) {
- abi16->handles &= ~(1 << (chan->chan->handle &
0xffff));
+ abi16->handles &= ~(1ULL << (chan->chan->handle &
0xffff));
nouveau_channel_del(&chan->chan);
}
@@ -280,7 +280,7 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS)
INIT_LIST_HEAD(&chan->notifiers);
list_add(&chan->head, &abi16->channels);
- abi16->handles |= (1 << init->channel);
+ abi16->handles |= (1ULL << init->channel);
/* create channel object and initialise dma and fence management */
ret = nouveau_channel_new(drm, cli, NVDRM_DEVICE, NVDRM_CHAN |
--
1.8.3.2
Ben Skeggs
2014-Feb-10 04:26 UTC
[Nouveau] [PATCH 1/2] drm/nouveau: replace ffsll with __ffs64
On Mon, Feb 10, 2014 at 6:51 AM, Ilia Mirkin <imirkin at alum.mit.edu> wrote:> The ffsll function is a lot slower than the __ffs64 built-in which > compiles to a single instruction on 64-bit. It's also nice to avoid > custom versions of standard functions. Note that __ffs == ffs - 1. > > Signed-off-by: Ilia Mirkin <imirkin at alum.mit.edu>Merged both patches in the series. Thanks.> --- > > I wrote a user-space program to test these out and make sure that the > functions behaved as expected. The logic in abi16 had to be flipped around a > bit since __ffs doesn't distinguish between 0 and 1. There's a minor > difference in that init->channel is going to get returned as 0 for ENOSPC vs > -1, but I can't imagine that'd matter. > > drivers/gpu/drm/nouveau/core/core/parent.c | 2 +- > drivers/gpu/drm/nouveau/core/os.h | 11 ----------- > drivers/gpu/drm/nouveau/nouveau_abi16.c | 4 ++-- > 3 files changed, 3 insertions(+), 14 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/core/core/parent.c b/drivers/gpu/drm/nouveau/core/core/parent.c > index 313380c..dee5d12 100644 > --- a/drivers/gpu/drm/nouveau/core/core/parent.c > +++ b/drivers/gpu/drm/nouveau/core/core/parent.c > @@ -49,7 +49,7 @@ nouveau_parent_sclass(struct nouveau_object *parent, u16 handle, > > mask = nv_parent(parent)->engine; > while (mask) { > - int i = ffsll(mask) - 1; > + int i = __ffs64(mask); > > if (nv_iclass(parent, NV_CLIENT_CLASS)) > engine = nv_engine(nv_client(parent)->device); > diff --git a/drivers/gpu/drm/nouveau/core/os.h b/drivers/gpu/drm/nouveau/core/os.h > index 191e739..3cd6120 100644 > --- a/drivers/gpu/drm/nouveau/core/os.h > +++ b/drivers/gpu/drm/nouveau/core/os.h > @@ -23,17 +23,6 @@ > > #include <asm/unaligned.h> > > -static inline int > -ffsll(u64 mask) > -{ > - int i; > - for (i = 0; i < 64; i++) { > - if (mask & (1ULL << i)) > - return i + 1; > - } > - return 0; > -} > - > #ifndef ioread32_native > #ifdef __BIG_ENDIAN > #define ioread16_native ioread16be > diff --git a/drivers/gpu/drm/nouveau/nouveau_abi16.c b/drivers/gpu/drm/nouveau/nouveau_abi16.c > index 900fae0..b701117 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_abi16.c > +++ b/drivers/gpu/drm/nouveau/nouveau_abi16.c > @@ -270,8 +270,8 @@ nouveau_abi16_ioctl_channel_alloc(ABI16_IOCTL_ARGS) > return nouveau_abi16_put(abi16, -EINVAL); > > /* allocate "abi16 channel" data and make up a handle for it */ > - init->channel = ffsll(~abi16->handles); > - if (!init->channel--) > + init->channel = __ffs64(~abi16->handles); > + if (~abi16->handles == 0) > return nouveau_abi16_put(abi16, -ENOSPC); > > chan = kzalloc(sizeof(*chan), GFP_KERNEL); > -- > 1.8.3.2 > > _______________________________________________ > dri-devel mailing list > dri-devel at lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/dri-devel
Apparently Analagous Threads
- [PATCH] drm/nouveau: idle all channels before suspending
- [PATCH 0/2] drm/nouveau: Use more standard logging styles
- [PATCH 2/2] drm/nouveau/abi16: fix handles past the 32nd one
- [PATCH 1/2] drm/nouveau: hold mutex while syncing to kernel channel
- [PATCH] drm/nouveau: fix handling empty channel list in ioctl's