Displaying 20 results from an estimated 131 matches for "andling".
Did you mean: handling
2014 Jan 18
2
[LLVMdev] Scheduling quirks
Hello all!
When I compile the following more or less stupid functions with
clang++ -O3 -S test.cpp
===>
int test_register(int x) {
    x ^= (x >> 2);
    x ^= (x >> 3);
    x = x ^ (x >> 4);
    int y = x; x >>= 5; x ^= y; // almost the same but explicit
    return x;
}
int test_scheduler(int x) {
    return ((x >> 2) & 15) ^ ((x >> 3) & 31);
}
2008 Mar 26
2
[LLVMdev] Checked arithmetic
Hi Chris,
> Why not define an "add with overflow" intrinsic that returns its value and
> overflow bit as an i1?
what's the point? We have this today with apint codegen (if you turn on
LegalizeTypes). For example, this function
define i1 @cc(i32 %x, i32 %y) {
  %xx = zext i32 %x to i33
  %yy = zext i32 %y to i33
  %s = add i33 %xx, %yy        ; the 33-bit add keeps the carry
  %tmp = lshr i33 %s, 32       ; shift the carry down to bit 0
  %b = trunc i33 %tmp to i1
  ret i1 %b
}
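A C analogue of the i33 trick above (a sketch, not from the thread): do the add in a wider type and read the carry out of bit 32.

#include <stdint.h>

_Bool cc(uint32_t x, uint32_t y) {
    uint64_t s = (uint64_t)x + (uint64_t)y;  /* zext + add in a wider type */
    return (s >> 32) & 1;                    /* lshr 32: the carry bit */
}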
2016 Feb 11
3
Expected constant simplification not happening
Hi
the appended IR code does not optimize to my liking :)
this is the interesting part of the x86_64 output produced via clang -Os:
---
movq -16(%r12), %rax
movl -4(%rax), %ecx
andl $2298949, %ecx ## imm = 0x231445
cmpq $2298949, (%rax,%rcx) ## imm = 0x231445
leaq 8(%rax,%rcx), %rax
cmovneq %r15, %rax
movl $2298949, %esi ## imm = 0x231445
movq %r12, %rdi
movq %r14,
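The sequence looks like a table probe keyed by the constant 0x231445; a hypothetical C reduction (assumed shape, not the reporter's actual code):

#include <stdint.h>

const char *lookup(const char *table, uint32_t mask, int64_t key) {
    uint32_t idx = mask & (uint32_t)key;         /* andl $2298949, %ecx */
    if (*(const int64_t *)(table + idx) == key)  /* cmpq $2298949, (%rax,%rcx) */
        return table + idx + 8;                  /* leaq 8(%rax,%rcx), %rax */
    return 0;                                    /* cmovneq %r15, %rax */
}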
2007 Apr 18
2
[patch 3/8] Allow a kernel to not be in ring 0.
On Wed, 02 Aug 2006 17:25:13 -0700, Jeremy Fitzhardinge wrote:
> We allow for the fact that the guest kernel may not run in ring 0.
> This requires some abstraction in a few places when setting %cs or
> checking privilege level (user vs kernel).
I made some changes:
a. Added some comments about the SEGMENT_IS_*_CODE() macros.
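A sketch of the abstraction in question (names assumed, not quoted from the patch): test the RPL bits of the saved %cs rather than assuming the kernel owns ring 0, so the same check works under a hypervisor.

struct pt_regs_sketch { unsigned long xcs; };  /* saved %cs (hypothetical name) */

/* user mode if the saved %cs carries RPL 3, no matter which ring the
 * kernel itself runs in (under Xen a 32-bit guest kernel runs in ring 1) */
#define user_mode(regs)  ((((regs)->xcs) & 3) == 3)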
2010 Oct 07
2
[LLVMdev] [Q] x86 peephole deficiency
Hi all,
I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125)
and now I am running into a deficiency of the x86
peephole optimizer (or jump-threader?). Here is what I get:
andl $3, %edi
je .LBB0_4
# BB#2: # %nz
# in Loop: Header=BB0_1 Depth=1
cmpl $2, %edi
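The asm corresponds to code of roughly this shape (a hypothetical reduction inferred from the instructions, not taken from PR8125):

int f(int x) {
    switch (x & 3) {        /* andl $3, %edi */
    case 0:  return 0;      /* je .LBB0_4    */
    case 2:  return 2;      /* cmpl $2, %edi */
    default: return 1;
    }
}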
2014 Aug 08
4
[LLVMdev] Efficient Pattern matching in Instruction Combine
Hi Duncan, David, Sean.
Thanks for your reply.
> It'd be interesting if you could find a design that also treated these
> the same:
>
> (B ^ A) | ((A ^ B) ^ C) -> (A ^ B) | C
> (B ^ A) | ((B ^ C) ^ A) -> (A ^ B) | C
> (B ^ A) | ((C ^ A) ^ B) -> (A ^ B) | C
>
> I.e., `^` is also associative.
I agree with Duncan that associative operations should be covered too.
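The underlying identity is easy to check exhaustively; a throwaway C verifier (a sketch; 8-bit operands suffice since the identity is bitwise):

#include <assert.h>
#include <stdint.h>

int main(void) {
    for (unsigned a = 0; a < 256; ++a)
        for (unsigned b = 0; b < 256; ++b)
            for (unsigned c = 0; c < 256; ++c) {
                uint8_t lhs = (uint8_t)((b ^ a) | ((a ^ b) ^ c));
                uint8_t rhs = (uint8_t)((a ^ b) | c);
                assert(lhs == rhs);  /* (B^A) | ((A^B)^C) == (A^B) | C */
            }
    return 0;
}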
2016 Dec 07
1
Expected constant simplification not happening
Hello
Has there been any progress on this topic? The 3.9 optimizer output is
still the same, as I just checked.
https://llvm.org/bugs/show_bug.cgi?id=24448
Ciao
Nat!
Sanjay Patel wrote:
> [cc'ing Zia]
>
> We have this transform with -Os for some cases after:
> http://reviews.llvm.org/rL244601
> http://reviews.llvm.org/D11363
>
> but something in this example is
2017 Sep 25
0
What should a truncating store do?
(Not sure if this exactly maps to “truncating store”, but I think it at least touches some of the subjects discussed in this thread)
Our out-of-tree target needs several patches to get things working correctly for us.
We have introduced i24 and i40 types in ValueTypes/MachineValueTypes (in addition to the normal pow-of-2 types), and we have vectors of those (v2i40, v4i40).
And the byte size in our
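For reference, at the C level a truncating store of an i40 amounts to writing only the low five bytes of a wider value; a minimal sketch, assuming a little-endian target with 8-bit bytes:

#include <stdint.h>
#include <string.h>

void store_i40(uint8_t *p, uint64_t v) {
    v &= (UINT64_C(1) << 40) - 1;  /* truncate to 40 bits */
    memcpy(p, &v, 5);              /* low five bytes, little-endian */
}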
2007 Apr 18
1
[PATCH] Slight cleanups for x86 ring macros (against rc3-mm2)
Cleanup of the patch for letting the kernel run in a ring other than 0:
a. Add some comments about the SEGMENT_IS_*_CODE() macros.
b. Add a USER_RPL macro. (Code was comparing a value to a mask
in some places and to the magic number 3 in other places.)
c. Add macros for table indicator field and use them.
d. Change the entry.S tests for LDT stack segment to use the macros.
Signed-off-by: Chuck Ebbert
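For items (b) and (c), the i386 selector format pins down what such macros must say, though the exact spellings below are assumed rather than quoted from the patch: bits 0-1 hold the RPL and bit 2 is the table indicator.

#define SEGMENT_RPL_MASK 0x3   /* requested privilege level, bits 0-1 */
#define SEGMENT_TI_MASK  0x4   /* table indicator, bit 2 */
#define SEGMENT_LDT      0x4   /* TI set: selector indexes the LDT */
#define SEGMENT_GDT      0x0   /* TI clear: selector indexes the GDT */
#define USER_RPL         0x3   /* replaces the magic number 3 */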
2017 Sep 25
3
What should a truncating store do?
On 9/25/2017 9:14 AM, Björn Pettersson A wrote:
>
> (Not sure if this exactly maps to “truncating store”, but I think it
> at least touches some of the subjects discussed in this thread)
>
> Our out-of-tree target needs several patches to get things working
> correctly for us.
>
> We have introduced i24 and i40 types in ValueTypes/MachineValueTypes
> (in addition to
2014 Aug 13
2
[LLVMdev] Efficient Pattern matching in Instruction Combine
Thanks Sean for the reference.
I will go through it and see if I can implement it for generic boolean
expression minimization.
Regards,
Suyog
On Wed, Aug 13, 2014 at 2:30 AM, Sean Silva <chisophugis at gmail.com> wrote:
> Re-adding the mailing list (remember to hit "reply all")
>
>
> On Tue, Aug 12, 2014 at 9:36 AM, suyog sarda <sardask01 at gmail.com> wrote:
2010 Oct 07
0
[LLVMdev] [Q] x86 peephole deficiency
On Oct 6, 2010, at 6:16 PM, Gabor Greif wrote:
> Hi all,
>
> I am slowly working on a SwitchInst optimizer (http://llvm.org/PR8125)
> and now I am running into a deficiency of the x86
> peephole optimizer (or jump-threader?). Here is what I get:
>
>
> andl $3, %edi
> je .LBB0_4
> # BB#2: # %nz
>
2007 Apr 18
0
[PATCH 17/21] i386 Ldt cleanups 1
Big cleanup of LDT code. This code has very little type checking and is
not frequently used, so I audited the code, added type checking and size
optimizations to generate smaller assembly code.
First, just introduce some small definitions that will be used later.
Signed-off-by: Zachary Amsden <zach@vmware.com>
Index: linux-2.6.14-zach-work/arch/i386/kernel/entry.S
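The "small definitions" are along these lines (a sketch; the two-word descriptor layout is architectural, but the name is the historic i386 one and is assumed here):

/* an i386 segment descriptor: two 32-bit halves */
struct desc_struct {
    unsigned long a, b;
};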
2011 May 20
1
No logging
Dear folks,
strace shows that the snmp-ups driver writes to stderr:
[pid 21825] write(2, "No log handling enabled - turnin"..., 52) = 52
| 00000  4e 6f 20 6c 6f 67 20 68 61 6e 64 6c 69 6e 67 20  No log handling  |
| 00010  65 6e 61 62 6c 65 64 20 2d 20 74 75 72 6e 69 6e  enabled - turnin |
| 00020  67 20 6f 6e 20 73 74 64 65 72 72 20 6c 6f 67 67  g on stderr logg |
| 00030  69 6e 67 0a                                      ing.             |
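The message itself comes from the net-snmp library when no log handler has been registered. A minimal sketch of registering one deliberately before library init, using net-snmp's snmp_disable_log()/snmp_enable_stderrlog(); whether snmp-ups should do this is the question raised here:

#include <net-snmp/net-snmp-config.h>
#include <net-snmp/net-snmp-includes.h>

static void setup_snmp_logging(void) {
    snmp_disable_log();       /* clear any default handlers */
    snmp_enable_stderrlog();  /* route net-snmp messages to stderr on purpose */
}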
2008 Mar 26
0
[LLVMdev] Checked arithmetic
On Wed, 26 Mar 2008, Jonathan S. Shapiro wrote:
> I want to background process this for a bit, but it would be helpful to
> discuss some approaches first.
>
> There would appear to be three approaches:
>
> 1. Introduce a CC register class into the IR. This seems to be a
> fairly major overhaul.
>
> 2. Introduce a set of scalar and fp computation quasi-instructions
2008 Mar 26
0
[LLVMdev] Checked arithmetic
On Wed, 26 Mar 2008, Duncan Sands wrote:
> Hi Chris,
>
>> Why not define an "add with overflow" intrinsic that returns its value and
>> overflow bit as an i1?
>
> what's the point? We have this today with apint codegen (if you turn on
> LegalizeTypes). For example, this function
The desired code is something like:
foo:
        addl %eax, %ecx
        jo
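Modern compilers expose exactly this through checked-add builtins; a sketch using clang/GCC's __builtin_add_overflow, which lowers to an add followed by a branch or set on the overflow flag, much like the addl/jo above:

#include <stdbool.h>

bool add_overflows(int x, int y) {
    int sum;
    return __builtin_add_overflow(x, y, &sum);  /* true on signed overflow */
}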
2017 Aug 02
3
[InstCombine] Simplification sometimes only transforms but doesn't simplify instruction, causing side effect in other pass
...struction there. In addition, replacing %vreg0 with %vreg1 may
introduce an extra move before "AND32ri8 %vreg0<tied0>, 31", so we still
need to check that "AND32ri8 %vreg0<tied0>, 31" is colder than "AND32ri
%vreg0<tied0>, 1792".
All these efforts only handle one specific pattern; if the pattern
changes even a little, they won't work.
BB_i:
%vreg9<def> = COPY %vreg0:sub_8bit; GR8:%vreg9 GR32:%vreg0
%vreg1<def> = MOVZX32rr8 %vreg9<kill>; GR32:%vreg1 GR8:%vreg9
%vreg10<def,tied1> = AND32ri %vreg0<tie...