Displaying 20 results from an estimated 33 matches for "idiv".
2014 Jan 11
3
[LLVMdev] Possible error in docs.
http://llvm.org/docs/CodeGenerator.html#machine-code-description-classes
Section starting:
Fixed (preassigned) registers
It talks about converting:
define i32 @test(i32 %X, i32 %Y) {
  %Z = udiv i32 %X, %Y
  ret i32 %Z
}
into
;; X is in EAX, Y is in ECX
mov %EAX, %EDX
sar %EDX, 31
idiv %ECX
ret
BUT, where does the "sar" come from?
Kind Regards
James
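On the question above: x86 `idiv` divides the 64-bit value held in EDX:EAX, so for a signed divide EDX has to contain copies of EAX's sign bit, and `sar` by 31 produces exactly that (0 for non-negative values, all ones for negative ones). An unsigned `udiv` would instead pair a zeroed EDX with `div`, which is presumably why the snippet looks inconsistent with the IR. A minimal C sketch of the sign fill, using hypothetical helper names that do not come from the LLVM docs:

```c
#include <stdint.h>

/* Hypothetical helper: the value `sar reg, 31` leaves in EDX.
   Written as a comparison because right-shifting a negative int
   is implementation-defined in C. */
static int32_t edx_for_idiv(int32_t eax) {
    return eax < 0 ? -1 : 0;
}

/* Hypothetical helper: the EDX:EAX pair read back as one 64-bit dividend.
   The final cast relies on the usual two's complement conversion. */
static int64_t edx_eax(int32_t edx, int32_t eax) {
    uint64_t bits = ((uint64_t)(uint32_t)edx << 32) | (uint32_t)eax;
    return (int64_t)bits;
}
```

With the sign fill, `edx_eax(edx_for_idiv(x), x)` equals `x` for every 32-bit `x`; with EDX forced to zero, a negative `x` would be read as a large positive dividend.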
2012 Mar 27
1
[LLVMdev] Compiling integer mod
For the simple C program below I show the output of clang and the
output of the VS compiler (I am on Windows). Maybe this is obvious to
you, but is it really faster to do 2 multiplications, 3 movl
instructions, 2 shifts, 1 add, and 1 subtract than to do 1 mov, 1
cdq, and 1 idiv?
I ran into this while trying to understand why my code runs slower
with llvm than a comparable program on Windows.
Thanks for any help,
Brent
int f(int n)
{
    return (n + 1) % 18;
}
"clang -O2 -S" produces this code:
_f: # @f
# BB#0:
movl 4(%es...
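The multiply-and-shift sequence Brent describes is the usual strength reduction of division by a constant: the compiler multiplies by a precomputed fixed-point reciprocal and shifts, avoiding `idiv`, which is generally much slower than a multiply on x86. A hedged sketch for the divisor 18, assuming an arithmetic right shift on negative values (true of mainstream compilers); the constant and function names are illustrative and are not taken from clang's output, which is truncated above:

```c
#include <stdint.h>

/* ceil(2^34 / 18): fixed-point reciprocal of 18 (assumed constant for this sketch). */
#define MAGIC_18 0x38E38E39LL

/* Truncating division by 18 without a divide instruction. */
static int32_t div18(int32_t x) {
    /* Assumes >> on a negative int64_t is an arithmetic shift. */
    int32_t q = (int32_t)(((int64_t)x * MAGIC_18) >> 34);
    /* For negative x the shifted product is one below the truncated
       quotient, so add 1 to round toward zero. */
    return q + (x < 0);
}

/* Remainder recovered from the quotient: x - (x / 18) * 18. */
static int32_t mod18(int32_t x) {
    return x - div18(x) * 18;
}
```

For example, `mod18(35)` yields 17 and `mod18(-1)` yields -1, matching C's `%` operator.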
2012 Jun 18
2
[LLVMdev] Best way to replace LLVM IR operation with code containing control flow?
Hi,
-Does anyone know where a backend-specific optimization can be added to replace an instruction with code containing control flow?
I'm interested in adding an optimization for the DIV instruction (x86-atom) which replaces the IDIV/DIV with code containing control flow to select between the intended IDIV/DIV and an 8-bit DIV with movzx, as described in the Intel Atom Optimization Guide. My first attempt was to add this with a custom inserter in X86ISelLowering (see EmitInstrWithCustomInserter). However, the isel already done...
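The transformation Tyler describes can be pictured at the source level as a range check that selects a narrow divide when both operands fit in 8 bits. The C below only illustrates the control flow, with a hypothetical function name; it is not LLVM code and says nothing about where in the backend such a rewrite belongs:

```c
#include <stdint.h>

/* Hypothetical illustration of the fast-path selection.
   The caller must ensure b != 0. */
static uint32_t udiv_with_narrow_path(uint32_t a, uint32_t b) {
    if (((a | b) & 0xFFFFFF00u) == 0) {
        /* Both operands fit in 8 bits: the case an 8-bit DIV could handle. */
        return (uint32_t)((uint8_t)a / (uint8_t)b);
    }
    /* Otherwise fall back to the full-width divide. */
    return a / b;
}
```

Both paths return the same quotient; the branch only matters on hardware where the narrow divide is cheaper.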
2018 Mar 01
0
[parallel] fixes load balancing of parLapplyLB
...gt;>> load-balancing you propose in (C).?? (*) Different definition from the
>>> above 'scale'. (Disclaimer: I'm the author)
>>>
>>> /Henrik
>>>
>>> On Mon, Feb 19, 2018 at 10:21 AM, Christian Krause
>>> <christian.krause at idiv.de> wrote:
>>>> Dear R-Devel List,
>>>>
>>>> I have installed R 3.4.3 with the patch applied on our cluster and
>>> ran a *real-world* job of one of our users to confirm that the patch
>>> works to my satisfaction. Here are the results.
>...
2018 Feb 12
2
[parallel] fixes load balancing of parLapplyLB
..., and then load balancing
should work.
Best Regards
--
Christian Krause
Scientific Computing Administration and Support
------------------------------------------------------------------------------------------------------------------------
Phone: +49 341 97 33144
Email: christian.krause at idiv.de
------------------------------------------------------------------------------------------------------------------------
German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig
Deutscher Platz 5e
04103 Leipzig
Germany
--------------------------------------------------...
2018 Feb 26
2
[parallel] fixes load balancing of parLapplyLB
...= 4 achieves the amount of
>> load-balancing you propose in (C). (*) Different definition from the
>> above 'scale'. (Disclaimer: I'm the author)
>>
>> /Henrik
>>
>> On Mon, Feb 19, 2018 at 10:21 AM, Christian Krause
>> <christian.krause at idiv.de> wrote:
>>> Dear R-Devel List,
>>>
>>> I have installed R 3.4.3 with the patch applied on our cluster and
>> ran a *real-world* job of one of our users to confirm that the patch
>> works to my satisfaction. Here are the results.
>>> The original...
2016 Nov 24
1
[parallel-package] feature request: set default cluster type via environment variable
...------------------------------------------------------------------------
Phone: +49 341 97 33144
Email: christian.krause at idiv.de
------------------------------------------------------------------------...
2018 Feb 19
2
[parallel] fixes load balancing of parLapplyLB
...nd future.scheduling = +Inf
to (A). Using future.scheduling = 4 achieves the amount of
load-balancing you propose in (C). (*) Different definition from the
above 'scale'. (Disclaimer: I'm the author)
/Henrik
On Mon, Feb 19, 2018 at 10:21 AM, Christian Krause
<christian.krause at idiv.de> wrote:
> Dear R-Devel List,
>
> I have installed R 3.4.3 with the patch applied on our cluster and ran a *real-world* job of one of our users to confirm that the patch works to my satisfaction. Here are the results.
>
> The original was a series of jobs, all essentially doing...
2019 Jun 15
3
Constrained integer DIV (WAS: Re: Planned change to IR semantics: constant expressions never have undefined behavior)
...n keep it local.
Do you really need a special way to represent this in IR? If you need a SIGFPE, the frontend can generate the equivalent of "if (x==0) raise(SIGFPE);", which is well-defined across all targets. If you really need a "real" trap, I guess we could expose the x86 idiv as an intrinsic.
-Eli
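Eli's suggestion can be sketched in C as an explicit guard placed ahead of the division. The function name is hypothetical, the divisor is called `y` here, and the value returned after `raise` is an arbitrary choice that only matters if a signal handler returns:

```c
#include <signal.h>

/* Hypothetical front-end lowering of a "trapping" signed divide. */
static int checked_sdiv(int x, int y) {
    if (y == 0) {
        raise(SIGFPE);  /* well-defined on every target, unlike x / 0 */
        return 0;       /* reached only if a handler returns */
    }
    /* Note: INT_MIN / -1 is still undefined and is not guarded here. */
    return x / y;
}
```

Because the check is ordinary control flow, the optimizer can fold or hoist it like any other branch, without the division itself carrying a trapping requirement.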
2013 Apr 05
0
[LLVMdev] Integer divide by zero
...it will always trap and has no result of any type.
Per C and C++, integer division by 0 is undefined. That means, if it
happens, the compiler is free to do whatever it wants. It is perfectly
legal for LLVM to define r to be, say, 42 in this code; it is not
required to preserve the fact that the idiv instruction on x86 and
x86-64 will trap.
--
Joshua Cranmer
Thunderbird and DXR developer
Source code archæologist
2013 Apr 05
3
[LLVMdev] Integer divide by zero
Hey guys,
I'm learning that LLVM does not preserve faults during constant folding. I
realize that this is an architecture dependent problem, but I'm not sure if
it's safe to constant fold away a fault on x86-64.
A little testcase:
#include <stdio.h>
int foo(int j, int d) {
    return j / d ;
}
int bar (int k, int d) {
    return foo(k + 1, d);
}
int main( void ) {
int r =
2018 Feb 20
0
[parallel] fixes load balancing of parLapplyLB
...to (A). Using future.scheduling = 4 achieves the amount of
>load-balancing you propose in (C). (*) Different definition from the
>above 'scale'. (Disclaimer: I'm the author)
>
>/Henrik
>
>On Mon, Feb 19, 2018 at 10:21 AM, Christian Krause
><christian.krause at idiv.de> wrote:
>> Dear R-Devel List,
>>
>> I have installed R 3.4.3 with the patch applied on our cluster and
>ran a *real-world* job of one of our users to confirm that the patch
>works to my satisfaction. Here are the results.
>>
>> The original was a series of...
2018 Feb 19
0
[parallel] fixes load balancing of parLapplyLB
...------------------------------------------------------------------------
Email: christian.krause at idiv.de
Office: BioCity Leipzig 5e, Room 3.201.3
Phone: +49 341 97 33144
------------------------------------------------------------------------...
2012 May 09
2
[LLVMdev] instructions requiring specific physical registers for operands
Hi,
I have some instructions that require the operand to be placed in exactly one physical register, and thus I have introduced a Just_a0 register class.
I have found that the register allocators / coalescer do not seem to care about this. In many cases they "run out of registers during register allocation". I have managed to avoid some problems, by inserting target move instructions in
2012 May 09
0
[LLVMdev] instructions requiring specific physical registers for operands
Hello Jonas,
> I wonder, what would be the best solution for instructions that require
> operands in a particular register, and even gives the result in a particular
> register?
You need to custom select such instruction. See e.g. div / idiv on x86
as an example.
--
With best regards, Anton Korobeynikov
Faculty of Mathematics and Mechanics, Saint Petersburg State University
2012 Jun 19
0
[LLVMdev] Best way to replace LLVM IR operation with code containing control flow?
Hi Tyler,
> -Does anyone know where a backend-specific optimization can be added to replace
> an instruction with code containing control flow?
I think the backend lowering of atomic intrinsics generates control flow
(loops), so that may give you a clue.
Ciao, Duncan.
2013 Apr 05
4
[LLVMdev] Integer divide by zero
...geot18 at gmail.com> wrote:
...
> Per C and C++, integer division by 0 is undefined. That means, if it
> happens, the compiler is free to do whatever it wants. It is perfectly
> legal for LLVM to define r to be, say, 42 in this code; it is not required
> to preserve the fact that the idiv instruction on x86 and x86-64 will trap.
This is quite a conundrum to me. Yes, I agree with you on the C/C++
Standards interpretation. However, the x86-64 expectations are orthogonal.
I find that other compilers, including GCC, will trap by default at high
optimization levels on x86-64 for this t...
2015 Jan 29
4
[LLVMdev] CPUStringIsValid() into MCSubtargetInfo and use it for ARM .cpu parsing
Tim,
How about the below option ?
1. Specify an existing generic armv7 CPU or the CPU which is close my custom variant. My custom variant can be treated as "cortex-a9" + hwdiv.
So my CPU here is "cortex-a9"
2. Specify the ".arch_extension idiv" which is available as an extension for my custom variant.
3. Teach LLVM & Clang about your CPU's features, either locally or upstream.
4. Pass "-mhwdiv=arm,thumb" to Clang (or less if you only have hwdiv in one mode).
--Sumanth G
-----Original Message-----
From: Tim North...
2018 Dec 03
3
The builtins library of compiler-rt is a performance HOG^WKILLER
...ntegers giving a 64-bit product!
I expect that a library written 20+ years later takes
advantage of these instructions!
__divsi3 (18 instructions) performs a DIV after 2 calls of abs(),
plus a final negation, instead of just
a single IDIV
__modsi3 (14 instructions) calls __divsi3 (18 instructions)
__divmodsi4 (17 instructions) calls __divsi3 (18 instructions)
__udivsi3 (52 instructions) does NOT use DIV, but performs BITWISE
division using shifts and additions!
__umodsi3 (14 instructions) calls __udivsi3...
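For readers unfamiliar with the two strategies being criticised, here is a rough C sketch of each: a shift-and-subtract unsigned divide, and a signed divide built from absolute values plus a final negation. This is an illustration written for this summary with hypothetical names; it is not the compiler-rt source, and both functions require a nonzero divisor.

```c
#include <stdint.h>

/* Bitwise (restoring) division: one quotient bit per iteration. */
static uint32_t udiv_bitwise(uint32_t n, uint32_t d) {
    uint32_t q = 0, r = 0;
    for (int i = 31; i >= 0; --i) {
        r = (r << 1) | ((n >> i) & 1u);  /* bring down the next bit of n */
        if (r >= d) {                    /* divisor fits: subtract it and */
            r -= d;                      /* set this quotient bit */
            q |= 1u << i;
        }
    }
    return q;
}

/* Signed divide via unsigned magnitudes, then a sign fix-up. */
static int32_t sdiv_via_abs(int32_t a, int32_t b) {
    uint32_t ua = a < 0 ? 0u - (uint32_t)a : (uint32_t)a;
    uint32_t ub = b < 0 ? 0u - (uint32_t)b : (uint32_t)b;
    uint32_t q = udiv_bitwise(ua, ub);
    /* Negate in unsigned arithmetic to avoid signed overflow;
       the cast assumes the usual two's complement conversion. */
    return (int32_t)(((a < 0) != (b < 0)) ? 0u - q : q);
}
```

The loop in `udiv_bitwise` always runs 32 iterations, which is the cost the post contrasts with a single hardware DIV or IDIV.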
2018 Dec 03
3
The builtins library of compiler-rt is a performance HOG^WKILLER
...ect that a library written 20+ years later takes
>> advantage of these instructions!
>>
>> __divsi3 (18 instructions) performs a DIV after 2 calls of abs(),
>> plus a final negation, instead of just
>> a single IDIV
>> __modsi3 (14 instructions) calls __divsi3 (18 instructions)
>> __divmodsi4 (17 instructions) calls __divsi3 (18 instructions)
>>
>> __udivsi3 (52 instructions) does NOT use DIV, but performs BITWISE
>> division using shifts and additions!...