search for: idiv

Displaying 20 results from an estimated 33 matches for "idiv".

2014 Jan 11
3
[LLVMdev] Possible error in docs.
http://llvm.org/docs/CodeGenerator.html#machine-code-description-classes Section starting: Fixed (preassigned) registers. It talks about converting:
  define i32 @test(i32 %X, i32 %Y) {
    %Z = udiv i32 %X, %Y
    ret i32 %Z
  }
into
  ;; X is in EAX, Y is in ECX
  mov %EAX, %EDX
  sar %EDX, 31
  idiv %ECX
  ret
BUT, where does the "sar" come from? Kind Regards, James
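A likely answer to the question above: `sar %EDX, 31` copies the sign bit of EAX into every bit of EDX, which is the sign extension that the signed `idiv` expects in EDX:EAX. For the `udiv` in the IR, one would instead expect EDX to be zeroed and the unsigned `div` to be used. The C sketch below models both pairings. The helper names (`high_half_signed`, `model_idiv`, and so on) are hypothetical and only illustrate the arithmetic; they are not part of LLVM.

```c
#include <assert.h>
#include <stdint.h>

/* Value EDX holds after "mov eax->edx; sar edx, 31" (or "cdq"):
 * every bit is a copy of the sign bit of EAX. */
static uint32_t high_half_signed(int32_t eax) {
    return eax < 0 ? 0xFFFFFFFFu : 0u;
}

/* Value EDX holds before an unsigned divide: simply zero. */
static uint32_t high_half_unsigned(void) {
    return 0u;
}

/* Model of "idiv": signed 64-bit EDX:EAX divided by a 32-bit divisor.
 * Assumes two's-complement conversion from uint64_t to int64_t and a
 * quotient that fits in 32 bits (the real instruction faults otherwise). */
static int32_t model_idiv(uint32_t edx, uint32_t eax, int32_t divisor) {
    int64_t dividend = (int64_t)(((uint64_t)edx << 32) | eax);
    assert(divisor != 0);
    return (int32_t)(dividend / divisor);
}

/* Model of "div": unsigned 64-bit EDX:EAX divided by a 32-bit divisor. */
static uint32_t model_div(uint32_t edx, uint32_t eax, uint32_t divisor) {
    uint64_t dividend = ((uint64_t)edx << 32) | eax;
    assert(divisor != 0);
    return (uint32_t)(dividend / divisor);
}
```

With X = 0x80000000 and Y = 2, the zero-extended `div` model yields 0x40000000, while the sign-extended `idiv` model yields -1073741824, so the two instruction sequences are not interchangeable for a `udiv`.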
2012 Mar 27
1
[LLVMdev] Compiling integer mod
For the simple C program below I show the output of clang and the output of the VS compiler (I am on Windows). Maybe this is obvious to you, but is it really faster to do 2 multiplications, 3 movl instructions, 2 shifts, 1 add, and 1 subtract than to do 1 mov, 1 cdq, and 1 idiv? I ran into this while trying to understand why my code runs slower with llvm than a comparable program on Windows. Thanks for any help, Brent int f(int n) { return (n + 1) % 18; } "clang -O2 -S" produces this code: _f: # @f # BB#0: movl 4(%es...
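The instruction mix described above (two multiplications, two shifts, an add, and a subtract) matches the usual way compilers replace division by a constant with a multiply by a precomputed reciprocal. The following C sketch shows that technique for a divisor of 18. The constant 0x38E38E39 is ceil(2^34 / 18); the function names `div18` and `mod18` are hypothetical, and the exact sequence clang emits may differ. The code assumes that `>>` on a negative `int64_t` is an arithmetic shift, which holds on mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>

/* Signed division by 18 without a divide instruction.
 * M = ceil(2^34 / 18) = 0x38E38E39, so (n * M) >> 34 is floor(n / 18)
 * for non-negative n and one less than the truncated quotient for
 * negative n. Assumes >> on a negative int64_t is an arithmetic shift. */
static int32_t div18(int32_t n) {
    int64_t product = (int64_t)n * 0x38E38E39LL;   /* first multiplication */
    int32_t q = (int32_t)(product >> 34);          /* first shift */
    q += (int32_t)((uint32_t)n >> 31);             /* second shift, add: +1 if n < 0 */
    return q;
}

/* Remainder recovered from the quotient. */
static int32_t mod18(int32_t n) {
    return n - div18(n) * 18;                      /* second multiplication, subtract */
}

/* The function from the post, expressed with the helpers above. */
int f(int n) {
    return mod18(n + 1);
}
```

Whether this beats `cdq` plus `idiv` depends on the processor, but integer division is typically far slower than multiplication, which is why compilers prefer this form.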
2012 Jun 18
2
[LLVMdev] Best way to replace LLVM IR operation with code containing control flow?
Hi, Does anyone know where a backend-specific optimization can be added to replace an instruction with code containing control flow? I'm interested in adding an optimization for the DIV instruction (x86-atom) which replaces the IDIV/DIV with code containing control flow to select between the intended IDIV/DIV and an 8-bit DIV with movzx, as described in the Intel Atom Optimization Guide. My first attempt was to add this with a custom inserter in X86ISelLowering (see EmitInstrWithCustomInserter). However, the isel already done...
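The control flow being asked about can be sketched at the C level as follows. This is only a model of the branch structure for the unsigned case: the function name `udiv_narrow_first` is hypothetical, and a C compiler is free to choose any divide instruction for either path, so it does not reproduce the 8-bit DIV with movzx that the backend transformation would emit.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned divide that takes a narrow fast path when both operands
 * fit in 8 bits, and falls back to the full-width divide otherwise. */
static uint32_t udiv_narrow_first(uint32_t dividend, uint32_t divisor) {
    assert(divisor != 0);
    if (((dividend | divisor) & 0xFFFFFF00u) == 0) {
        /* Both values are below 256: an 8-bit divide gives the same result. */
        return (uint32_t)((uint8_t)dividend / (uint8_t)divisor);
    }
    /* General case: the originally intended full-width divide. */
    return dividend / divisor;
}
```

The single OR-and-mask test checks both operands at once, so the fast path costs one extra compare and branch.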
2018 Mar 01
0
[parallel] fixes load balancing of parLapplyLB
...>>> load-balancing you propose in (C). (*) Different definition from the >>> above 'scale'. (Disclaimer: I'm the author) >>> >>> /Henrik >>> >>> On Mon, Feb 19, 2018 at 10:21 AM, Christian Krause >>> <christian.krause at idiv.de> wrote: >>>> Dear R-Devel List, >>>> >>>> I have installed R 3.4.3 with the patch applied on our cluster and >>> ran a *real-world* job of one of our users to confirm that the patch >>> works to my satisfaction. Here are the results. ...
2018 Feb 12
2
[parallel] fixes load balancing of parLapplyLB
..., and then load balancing should work. Best Regards -- Christian Krause Scientific Computing Administration and Support -------- Phone: +49 341 97 33144 Email: christian.krause at idiv.de -------- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig Deutscher Platz 5e 04103 Leipzig Germany --------...
2018 Feb 26
2
[parallel] fixes load balancing of parLapplyLB
...= 4 achieves the amount of >> load-balancing you propose in (C). (*) Different definition from the >> above 'scale'. (Disclaimer: I'm the author) >> >> /Henrik >> >> On Mon, Feb 19, 2018 at 10:21 AM, Christian Krause >> <christian.krause at idiv.de> wrote: >>> Dear R-Devel List, >>> >>> I have installed R 3.4.3 with the patch applied on our cluster and >> ran a *real-world* job of one of our users to confirm that the patch >> works to my satisfaction. Here are the results. >>> The original...
2016 Nov 24
1
[parallel-package] feature request: set default cluster type via environment variable
...-------- Phone: +49 341 97 33144 Email: christian.krause at idiv.de --------...
2018 Feb 19
2
[parallel] fixes load balancing of parLapplyLB
...nd future.scheduling = +Inf to (A). Using future.scheduling = 4 achieves the amount of load-balancing you propose in (C). (*) Different definition from the above 'scale'. (Disclaimer: I'm the author) /Henrik On Mon, Feb 19, 2018 at 10:21 AM, Christian Krause <christian.krause at idiv.de> wrote: > Dear R-Devel List, > > I have installed R 3.4.3 with the patch applied on our cluster and ran a *real-world* job of one of our users to confirm that the patch works to my satisfaction. Here are the results. > > The original was a series of jobs, all essentially doing...
2019 Jun 15
3
Constrained integer DIV (WAS: Re: Planned change to IR semantics: constant expressions never have undefined behavior)
...n keep it local. Do you really need a special way to represent this in IR? If you need a SIGFPE, the frontend can generate the equivalent of "if (x==0) raise(SIGFPE);", which is well-defined across all targets. If you really need a "real" trap, I guess we could expose the x86 idiv as an intrinsic. -Eli
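The frontend-generated check suggested above, "if (x==0) raise(SIGFPE);", can be written out in C roughly as follows. The function name `checked_sdiv` and the choice to return 0 when a handler lets execution continue are assumptions made for this sketch. The extra INT_MIN / -1 test is also an addition: that case overflows and is likewise undefined in C.

```c
#include <limits.h>
#include <signal.h>

/* Signed division with an explicit, target-independent fault:
 * the check runs before the divide, so the division itself never
 * sees a zero divisor or the overflowing INT_MIN / -1 case. */
int checked_sdiv(int x, int y) {
    if (y == 0 || (x == INT_MIN && y == -1)) {
        raise(SIGFPE);
        /* Reached only if SIGFPE is ignored or its handler returns. */
        return 0;
    }
    return x / y;
}
```

Because the signal is raised by ordinary code rather than by a trapping instruction, the behavior is the same on targets whose divide instruction does not trap.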
2013 Apr 05
0
[LLVMdev] Integer divide by zero
...it will always trap and has no result of any type. Per C and C++, integer division by 0 is undefined. That means, if it happens, the compiler is free to do whatever it wants. It is perfectly legal for LLVM to define r to be, say, 42 in this code; it is not required to preserve the fact that the idiv instruction on x86 and x86-64 will trap. -- Joshua Cranmer Thunderbird and DXR developer Source code archæologist
2013 Apr 05
3
[LLVMdev] Integer divide by zero
Hey guys, I'm learning that LLVM does not preserve faults during constant folding. I realize that this is an architecture dependent problem, but I'm not sure if it's safe to constant fold away a fault on x86-64. A little testcase: #include <stdio.h> int foo(int j, int d) { return j / d ; } int bar (int k, int d) { return foo(k + 1, d); } int main( void ) { int r =
2018 Feb 20
0
[parallel] fixes load balancing of parLapplyLB
...to (A). Using future.scheduling = 4 achieves the amount of >load-balancing you propose in (C). (*) Different definition from the >above 'scale'. (Disclaimer: I'm the author) > >/Henrik > >On Mon, Feb 19, 2018 at 10:21 AM, Christian Krause ><christian.krause at idiv.de> wrote: >> Dear R-Devel List, >> >> I have installed R 3.4.3 with the patch applied on our cluster and >ran a *real-world* job of one of our users to confirm that the patch >works to my satisfaction. Here are the results. >> >> The original was a series of...
2018 Feb 19
0
[parallel] fixes load balancing of parLapplyLB
...-------- Email: christian.krause at idiv.de Office: BioCity Leipzig 5e, Room 3.201.3 Phone: +49 341 97 33144 --------...
2012 May 09
2
[LLVMdev] instructions requiring specific physical registers for operands
Hi, I have some instructions that require the operand to be placed in exactly one physical register, and thus I have introduced a Just_a0 register class. I have found that the register allocators / coalescer do not seem to care about this. In many cases they "run out of registers during register allocation". I have managed to avoid some problems, by inserting target move instructions in
2012 May 09
0
[LLVMdev] instructions requiring specific physical registers for operands
Hello Jonas, > I wonder, what would be the best solution for instructions that require > operands in a particular register, and even gives the result in a particular > register? You need to custom select such instruction. See e.g. div / idiv on x86 as an example. -- With best regards, Anton Korobeynikov Faculty of Mathematics and Mechanics, Saint Petersburg State University
2012 Jun 19
0
[LLVMdev] Best way to replace LLVM IR operation with code containing control flow?
Hi Tyler, > -Does anyone know where a backend-specific optimization can be added to replace > an instruction with code containing control flow? I think the backend lowering of atomic intrinsics generates control flow (loops), so that may give you a clue. Ciao, Duncan.
2013 Apr 05
4
[LLVMdev] Integer divide by zero
...geot18 at gmail.com> wrote: ... > Per C and C++, integer division by 0 is undefined. That means, if it > happens, the compiler is free to do whatever it wants. It is perfectly > legal for LLVM to define r to be, say, 42 in this code; it is not required > to preserve the fact that the idiv instruction on x86 and x86-64 will trap. This is quite a conundrum to me. Yes, I agree with you on the C/C++ Standards interpretation. However, the x86-64 expectations are orthogonal. I find that other compilers, including GCC, will trap by default at high optimization levels on x86-64 for this t...
2015 Jan 29
4
[LLVMdev] CPUStringIsValid() into MCSubtargetInfo and use it for ARM .cpu parsing
Tim, How about the option below? 1. Specify an existing generic armv7 CPU or the CPU which is closest to my custom variant. My custom variant can be treated as "cortex-a9" + hwdiv. So my CPU here is "cortex-a9" 2. Specify the ".arch_extension idiv" which is available as an extension for my custom variant. 3. Teach LLVM & Clang about your CPU's features, either locally or upstream. 4. Pass "-mhwdiv=arm,thumb" to Clang (or less if you only have hwdiv in one mode). --Sumanth G -----Original Message----- From: Tim North...
2018 Dec 03
3
The builtins library of compiler-rt is a performance HOG^WKILLER
...ntegers giving a 64-bit product! I expect that a library written 20+ years later takes advantage of these instructions! __divsi3 (18 instructions) performs a DIV after 2 calls of abs(), plus a final negation, instead of just a single IDIV __modsi3 (14 instructions) calls __divsi3 (18 instructions) __divmodsi4 (17 instructions) calls __divsi3 (18 instructions) __udivsi3 (52 instructions) does NOT use DIV, but performs BITWISE division using shifts and additions! __umodsi3 (14 instructions) calls __udivsi3...
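For readers unfamiliar with the two strategies criticized above, here is a generic C sketch of bitwise (shift-and-subtract) unsigned division, and of signed division built from an unsigned divide on absolute values plus a final negation. It is a textbook illustration under assumed names (`udiv_shift_subtract`, `sdiv_via_unsigned`) and is not the compiler-rt source.

```c
#include <assert.h>
#include <stdint.h>

/* Restoring long division, one quotient bit per iteration,
 * using only shifts, compares, and subtractions. */
static uint32_t udiv_shift_subtract(uint32_t n, uint32_t d) {
    uint32_t q = 0;
    uint64_t r = 0;  /* 64-bit so the shift cannot overflow for large d */
    assert(d != 0);
    for (int bit = 31; bit >= 0; --bit) {
        r = (r << 1) | ((n >> bit) & 1u);  /* bring down the next dividend bit */
        if (r >= d) {
            r -= d;
            q |= (uint32_t)1 << bit;
        }
    }
    return q;
}

/* Signed division via absolute values: divide the magnitudes, then
 * negate if the operand signs differ. Assumes two's-complement
 * conversion back to int32_t. */
static int32_t sdiv_via_unsigned(int32_t a, int32_t b) {
    uint32_t ua = a < 0 ? 0u - (uint32_t)a : (uint32_t)a;
    uint32_t ub = b < 0 ? 0u - (uint32_t)b : (uint32_t)b;
    uint32_t uq = udiv_shift_subtract(ua, ub);
    return (int32_t)(((a < 0) != (b < 0)) ? 0u - uq : uq);
}
```

On a target with hardware divide, each of these functions can collapse to a single DIV or IDIV, which is the point the post is making.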
2018 Dec 03
3
The builtins library of compiler-rt is a performance HOG^WKILLER
...ect that a library written 20+ years later takes >> advantage of these instructions! >> >> __divsi3 (18 instructions) perform a DIV after 2 calls of abs(), >> plus a final negation, instead of just >> a single IDIV >> __modsi3 (14 instructions) calls __divsi3 (18 instructions) >> __divmodsi4 (17 instructions) calls __divsi3 (18 instructions) >> >> __udivsi3 (52 instructions) does NOT use DIV, but performs BITWISE >> division using shifts and additions!...