Displaying 20 results from an estimated 559 matches for "s64".
2016 Apr 19
2
[PATCH v4 20/37] volt: add coefficients
...100644
> --- a/drm/nouveau/nvkm/subdev/volt/base.c
> +++ b/drm/nouveau/nvkm/subdev/volt/base.c
> @@ -110,13 +110,47 @@ nvkm_volt_map(struct nvkm_volt *volt, u8 id, u8 temp)
>
> vmap = nvbios_vmap_entry_parse(bios, id, &ver, &len, &info);
> if (vmap) {
> + s64 result;
> +
> + if (volt->speedo < 0)
> + return volt->speedo;
Hmm, so you will refuse reclocking if the speedo cannot be read...
Fair enough, but I would like to see a warning in the kernel logs.
> +
> + if (ver == 0x10 || (ver == 0x20 && info.mode == 0)) {
...
2018 Sep 21
2
[GlobalISel] Legalize generic instructions that also depend on type of scalar, not only scalar size
Hi,
Mips32 has 64 bit floating point instructions, while i64 instructions
have to be emulated with i32 instructions. This means that G_LOAD should
be custom legalized for s64 integer value, and be legal for s64 floating
point value. There are also other generic instructions with the same
problem: G_STORE, G_SELECT, G_EXTRACT, and G_INSERT.
There are also other configurations where integer and floating point
instructions of the same size are not simultaneously availa...
2016 Apr 18
0
[PATCH v4 20/37] volt: add coefficients
...dev/volt/base.c
index cecfac6..5e35d96 100644
--- a/drm/nouveau/nvkm/subdev/volt/base.c
+++ b/drm/nouveau/nvkm/subdev/volt/base.c
@@ -110,13 +110,47 @@ nvkm_volt_map(struct nvkm_volt *volt, u8 id, u8 temp)
vmap = nvbios_vmap_entry_parse(bios, id, &ver, &len, &info);
if (vmap) {
+ s64 result;
+
+ if (volt->speedo < 0)
+ return volt->speedo;
+
+ if (ver == 0x10 || (ver == 0x20 && info.mode == 0)) {
+ result = (s64)info.arg[0] / 10;
+ result += ((s64)info.arg[1] * volt->speedo) / 10;
+ result += ((s64)info.arg[2] * volt->speedo * volt->speedo)...
2014 Oct 24
3
[LLVMdev] IndVar widening in IndVarSimplify causing performance regression on GPU programs
...ut widening, the loop body in the PTX (a low-level assembly-like
language generated by NVPTX64) is:
BB0_2: // =>This Inner Loop Header: Depth=1
mul.lo.s32 %r5, %r6, %r6;
st.u32 [%rd4], %r5;
add.s32 %r6, %r6, 3;
add.s64 %rd4, %rd4, 12;
setp.lt.s32 %p2, %r6, %r3;
@%p2 bra BB0_2;
in which %r6 is the induction variable i.
With widening, the loop body becomes:
BB0_2: // =>This Inner Loop Header: Depth=1
mul.lo.s64 %rd8, %rd10, %rd10;...
2016 Sep 16
1
[PATCH] volt: use kernel's 64-bit signed division function
Doing direct 64 bit divisions in kernel code leads to references to
undefined symbols on 32 bit architectures. Replace such divisions with
calls to div64_s64 to make the module usable on 32 bit archs.
Signed-off-by: Alexandre Courbot <acourbot at nvidia.com>
---
drm/nouveau/nvkm/subdev/volt/base.c | 6 +++---
lib/include/nvif/os.h | 1 +
2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/drm/nouveau/nvkm/subdev/volt/bas...
2008 Aug 18
2
[PATCH] virtio_balloon: fix towards_target when deflating balloon
v and vb->num_pages are u32 and unsigned int, respectively. If v is less
than vb->num_pages (and it is, when deflating the balloon), the result is a
very large 32-bit number. Since we're returning an s64, instead of getting the
negative number we want, we get a very large positive number.
This handles the case where v < vb->num_pages and ensures we get a small,
negative s64 as the result.
Rusty: please push this for 2.6.27-rc4. It's probably appropriate for the
stable tree too...
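The wrap described in this message can be reproduced outside the kernel. What follows is a minimal userspace sketch, not the patch itself: the `u32`/`s64` typedefs and the names `diff_wrapping`, `diff_signed`, and `check_diff` are hypothetical stand-ins, and it assumes a platform where `unsigned int` is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's fixed-width types. */
typedef uint32_t u32;
typedef int64_t s64;

/* Problematic form: the subtraction is done in 32-bit unsigned
 * arithmetic, so a "negative" difference wraps around before it
 * is widened to s64. */
static s64 diff_wrapping(u32 target, u32 num_pages)
{
	return target - num_pages;
}

/* One possible fix: widen both operands to s64, then subtract. */
static s64 diff_signed(u32 target, u32 num_pages)
{
	return (s64)target - (s64)num_pages;
}

/* Self-check: deflating from 15 pages to a target of 10. */
static void check_diff(void)
{
	assert(diff_wrapping(10, 15) == 4294967291LL); /* 2^32 - 5 */
	assert(diff_signed(10, 15) == -5);
	assert(diff_signed(15, 10) == 5);
}
```

Calling `check_diff()` from a `main()` exercises both variants; only the widened form yields the small negative value the message asks for.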
2023 Aug 29
2
[PATCH] virtio_balloon: Fix endless deflation and inflation on arm64
...rs/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 5b15936a5214..625caac35264 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -386,6 +386,17 @@ static void stats_handle_request(struct virtio_balloon *vb)
virtqueue_kick(vq);
}
+static inline s64 align_pages_up(s64 diff)
+{
+ if (diff == 0)
+ return diff;
+
+ if (diff > 0)
+ return ALIGN(diff, VIRTIO_BALLOON_PAGES_PER_PAGE);
+
+ return -ALIGN(-diff, VIRTIO_BALLOON_PAGES_PER_PAGE);
+}
+
static inline s64 towards_target(struct virtio_balloon *vb)
{
s64 target;
@@ -396,7 +407,7 @@ sta...
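The rounding behaviour of `align_pages_up()` in the quoted diff can be tried in userspace. This is a sketch under stated assumptions: `ALIGN_UP` is a hypothetical stand-in for the kernel's `ALIGN()` macro (valid for power-of-two alignments), `PAGES_PER_PAGE` is an assumed value of 16 chosen for illustration in place of `VIRTIO_BALLOON_PAGES_PER_PAGE`, and `check_align` is a helper added here.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t s64; /* stand-in for the kernel type */

/* Assumed for illustration: 16 balloon pages per host page. */
#define PAGES_PER_PAGE 16

/* Userspace stand-in for ALIGN(): round x up to a multiple of a,
 * where a must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((s64)(a) - 1))

/* Same shape as the function in the diff: round the magnitude up,
 * keeping the sign, so negative values move away from zero. */
static s64 align_pages_up(s64 diff)
{
	if (diff == 0)
		return diff;

	if (diff > 0)
		return ALIGN_UP(diff, PAGES_PER_PAGE);

	return -ALIGN_UP(-diff, PAGES_PER_PAGE);
}

static void check_align(void)
{
	assert(align_pages_up(0) == 0);
	assert(align_pages_up(5) == 16);
	assert(align_pages_up(16) == 16);
	assert(align_pages_up(-5) == -16);
	assert(align_pages_up(-17) == -32);
}
```

Handling the negative case separately matters because applying the round-up directly to a negative value would move it toward zero rather than away from it.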
2018 Sep 14
2
[GlobalISel][MIPS] Legality and instruction combining
...at for TypeIdx==1. Is it intentionally implemented this way?
>> b) Is the plan to sometimes let s1 as legal type and ignore it later?
> I'm not sure what you mean here
>
For example, let's look at AArch64 G_SELECT:
getActionDefinitionsBuilder(G_SELECT)
.legalFor({{s32, s1}, {s64, s1}, {p0, s1}})
.clampScalar(0, s32, s64)
.widenScalarToNextPow2(0);
In this case the LLT of operand 1 (s1) in G_SELECT has size 1, and the
corresponding register class in the selected instruction has size 32 (that
is $src1 in AArch64::ANDSWri, which has the GPR32 register class).
For that reason s1...
2023 Aug 30
1
[PATCH] virtio_balloon: Fix endless deflation and inflation on arm64
.../virtio_balloon.c
> index 5b15936a5214..625caac35264 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -386,6 +386,17 @@ static void stats_handle_request(struct virtio_balloon *vb)
> virtqueue_kick(vq);
> }
>
> +static inline s64 align_pages_up(s64 diff)
> +{
> + if (diff == 0)
> + return diff;
> +
> + if (diff > 0)
> + return ALIGN(diff, VIRTIO_BALLOON_PAGES_PER_PAGE);
> +
> + return -ALIGN(-diff, VIRTIO_BALLOON_PAGES_PER_PAGE);
> +}
> +
> static inline s64 towards_target(struct virt...
2016 Apr 19
0
[PATCH v4 20/37] volt: add coefficients
On Tue, Apr 19, 2016 at 5:52 PM, Martin Peres <martin.peres at free.fr> wrote:
>> + result = ((s64)info.arg[0] * 15625) >> 18;
>> + result += ((s64)info.arg[1] * volt->speedo * 15625) >> 18;
>> + result += ((s64)info.arg[2] * temp * 15625) >> 10;
>> +...
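A note on the arithmetic in the quoted lines: since 1000000 = 2^6 * 5^6 and 15625 = 5^6, multiplying by 15625 and shifting right by 18 gives the same result as multiplying by 1000000 and shifting right by 24. The sketch below only checks that identity for non-negative inputs. The function names are hypothetical, and what units or fixed-point format the patch intends for the coefficients is not stated in the snippet.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t s64; /* stand-in for the kernel type */

/* Scaling as written in the quoted patch lines. */
static s64 scale_15625(s64 x)
{
	return (x * 15625) >> 18;
}

/* The same ratio with the factor of 64 left in: 1000000 / 2^24. */
static s64 scale_million(s64 x)
{
	return (x * 1000000) >> 24;
}

static void check_scale(void)
{
	/* 2^24 maps to exactly 1000000 under both forms. */
	assert(scale_15625((s64)1 << 24) == 1000000);
	assert(scale_million((s64)1 << 24) == 1000000);
	/* The two forms also agree on an arbitrary non-negative input. */
	assert(scale_15625(12345) == scale_million(12345));
}
```

Using the reduced factor keeps the intermediate product 64 times smaller, which leaves more headroom before the s64 multiplication can overflow.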
2014 Apr 19
4
[LLVMdev] [NVPTX] Eliminate common sub-expressions in a group of similar GEPs
...s PTX code that literally computes the pointer address of each GEP,
wasting tons of registers. e.g., it emits the following PTX for the
first load and similar PTX for other loads.
mov.u32 %r1, %tid.x;
mov.u32 %r2, %tid.y;
mul.wide.u32 %rl2, %r1, 128;
mov.u64 %rl3, a;
add.s64 %rl4, %rl3, %rl2;
mul.wide.u32 %rl5, %r2, 4;
add.s64 %rl6, %rl4, %rl5;
ld.shared.f32 %f1, [%rl6];
The resultant register pressure causes up to 20% slowdown on some of our
benchmarks.
To reduce register pressure, the optimization implemented in this patch
merges the common sub...
2013 Mar 01
4
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
...ples_2E_mandelbrot_2F_square_param_0
> > )
> > {
> > .reg .pred %p<396>;
> > .reg .s16 %rc<396>;
> > .reg .s16 %rs<396>;
> > .reg .s32 %r<396>;
> > .reg .s64 %rl<396>;
> > .reg .f32 %f<396>;
> > .reg .f64 %fl<396>;
> >
> > mov.f64 %fl0, examples_2E_mandelbrot_2F_square_param_0;
> > mul.f64 %fl0, %fl0, %fl0;
> >...
2018 Nov 27
2
[RFC] Tablegen-erated GlobalISel Combine Rules
...ICombineRule<
(defs reg:$D, reg:$A),
(match (G_LOAD $t1, $D),
(G_SEXT $A, $t1)),
(apply (G_SEXTLOAD $A, $D))> {
let MatchStartsFrom = (roots $D);
};
def : GICombineRule<
(defs reg:$D, reg:$A, reg:$B, reg:$C),
(match (G_TRUNC s32:$t1, s64:$A),
(G_TRUNC s32:$t2, s64:$B),
(G_ADD $D, $t1, $t2)
(G_SEXT s64:$C, $D)),
(apply (G_ADD $D, $A, $B),
(G_SEXT_INREG $C, $D))> {
let MatchStartsFrom = (roots $D);
};
def : GICombineRule<
(defs reg:$D1, reg:$D2, reg:$...
2017 Jul 02
2
[GlobalISel] G_LOAD/G_STORE i64/f64 handling
...rm + float/double configuration (-mtriple=i386-linux-gnu -mattr=+sse2 )
load i64, i64* %p1 - illegal, requires narrowScalar action
load double, double* %p1 - legal
What is the best approach to legalize this case? Should I mark G_LOAD/G_STORE s64 as Custom?
Regards,
Igor Breger
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
...brot_2F_square(
> > > .reg .b64 examples_2E_mandelbrot_2F_square_param_0
> > > )
> > > {
> > > .reg .pred %p<396>;
> > > .reg .s16 %rc<396>;
> > > .reg .s16 %rs<396>;
> > > .reg .s32 %r<396>;
> > > .reg .s64 %rl<396>;
> > > .reg .f32 %f<396>;
> > > .reg .f64 %fl<396>;
> > >
> > > mov.f64 %fl0, examples_2E_mandelbrot_2F_square_param_0;
> > > mul.f64 %fl0, %fl0, %fl0;
> > > mov.f64 func_retval0, %fl0;...
2018 Nov 30
2
[RFC] Tablegen-erated GlobalISel Combine Rules
...:$A),
>> (match (G_LOAD $t1, $D),
>> (G_SEXT $A, $t1)),
>> (apply (G_SEXTLOAD $A, $D))> {
>> let MatchStartsFrom = (roots $D);
>> };
>> def : GICombineRule<
>> (defs reg:$D, reg:$A, reg:$B, reg:$C),
>> (match (G_TRUNC s32:$t1, s64:$A),
>> (G_TRUNC s32:$t2, s64:$B),
>> (G_ADD $D, $t1, $t2)
>> (G_SEXT s64:$C, $D)),
>> (apply (G_ADD $D, $A, $B),
>> (G_SEXT_INREG $C, $D))> {
>> let MatchStartsFrom = (roots $D);
>> };
>> def : GICombineRule<...
2013 Mar 01
0
[LLVMdev] NVPTX CUDA_ERROR_NO_BINARY_FOR_GPU
..._param_0
>> > )
>> > {
>> > .reg .pred %p<396>;
>> > .reg .s16 %rc<396>;
>> > .reg .s16 %rs<396>;
>> > .reg .s32 %r<396>;
>> > .reg .s64 %rl<396>;
>> > .reg .f32 %f<396>;
>> > .reg .f64 %fl<396>;
>> >
>> > mov.f64 %fl0, examples_2E_mandelbrot_2F_square_param_0;
>> > mul.f64 %fl0, %fl0, %fl0;...
2006 Nov 15
0
How to print out float/double arguments from arg0, arg1, ...?
I want to print out arguments of float and double type, such as those of sin(),
cos(), etc. By trial and error, I came up with the following macros.
union {
double d64;
float f32[2];
int64_t s64;
int32_t s32[2];
} VALUE;
#define PRINT_F32_sparc(val) \
VALUE.s64 = val; \
printf("\n%s = %f\n", \
"val", VALUE.f32[1]);
#define PRINT_F32_i386(...
2017 Oct 21
2
Removing the register block in MIR
The MIR format currently has a short-hand syntax for declaring vreg
classes and banks in the function body so you can write something like
this:
name: foo
body: |
%3:gpr(s64) = ...
rather than the much more verbose and awkward:
name: foo
registers:
- { id: 3, class: gpr }
body: |
%3(s64) = ...
I'd like to make this shorthand the only way to do this. There are a few
things that need to be handled here:
- We should only print the class on defs, not...