Displaying 17 results from an estimated 17 matches for "mcgari".

2009 Apr 20
4
[LLVMdev] Unnecessary moves after sign-extension in 2-address target
My two-address target machine has sign-extension instructions to extend i8->i32 and i16->i32. When I compile this simple program: int sext (unsigned a, unsigned b, int c) { return (signed char) a + (signed short) b + c; } I get this IR: define i32 @sext(i32 %a, i32 %b, i32 %c) nounwind readnone { entry: %conv = trunc i32 %a to i8 ; <i8>
2009 Apr 21
3
[LLVMdev] Unnecessary moves after sign-extension in 2-address target
Dan Gohman wrote: > On Apr 19, 2009, at 6:15 PM, Greg McGary wrote: > >> Because sextb_r and sextw_r have destination tied to source operands, >> TwoAddressInstructionPass thinks it needs a copy. However, since the >> sext kills its source, the copy is unnecessary. Why does this happen? >> Is TwoAddressInstructionPass relying on a later pass to notice this
2009 Apr 16
3
[LLVMdev] Help me improve two-address code
Evan Cheng wrote: > On Apr 16, 2009, at 3:17 PM, Greg McGary wrote: > >> Is there some optimizer knob I'm not turning properly? In more complex >> cases, GCC does poorly with two-address operand choices and so bloats >> the code with unnecessary register moves. I have high hopes LLVM >> can do better, so this result for a simple case is bothersome. >>
2009 Apr 16
0
[LLVMdev] Help me improve two-address code
On Apr 16, 2009, at 3:17 PM, Greg McGary wrote: > I have my new port limping enough to compile a very basic function: > > int > foo (int a, int b, int c, int d) > { > return a + b - c + d; > } > > clang-cc -O2 yields: > > define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) nounwind readnone { > entry: > %add = add i32 %b, %a ; <i32> [#uses=1]
2009 Apr 17
0
[LLVMdev] Help me improve two-address code
On Apr 16, 2009, at 4:25 PM, Greg McGary wrote: > Evan Cheng wrote: >> On Apr 16, 2009, at 3:17 PM, Greg McGary wrote: >> >>> Is there some optimizer knob I'm not turning properly? In more >>> complex >>> cases, GCC does poorly with two-address operand choices and so >>> bloats >>> the code with unnecessary register moves. I
2009 Apr 21
0
[LLVMdev] Unnecessary moves after sign-extension in 2-address target
Greg McGary wrote: > ********** REWRITING TWO-ADDR INSTRS ********** > ********** Function: sext > %reg1028<def> = sextb_r %reg1025<kill> > prepend: %reg1028<def> = mov_rr %reg1025<kill> > rewrite to: %reg1028<def> = sextb_r %reg1028 > ... > %reg1030<def> = sextw_r %reg1026<kill> > prepend:
2009 Apr 21
0
[LLVMdev] Unnecessary moves after sign-extension in 2-address target
On Apr 19, 2009, at 6:15 PM, Greg McGary wrote: > > Because sextb_r and sextw_r have destination tied to source operands, > TwoAddressInstructionPass thinks it needs a copy. However, since the > sext kills its source, the copy is unnecessary. Why does this happen? > Is TwoAddressInstructionPass relying on a later pass to notice this > and > transform it again? Yes, the
2009 Apr 22
0
[LLVMdev] Unnecessary moves after sign-extension in 2-address target
On Apr 21, 2009, at 4:02 PM, Greg McGary wrote: > Dan Gohman wrote: >> On Apr 19, 2009, at 6:15 PM, Greg McGary wrote: >> >>> Because sextb_r and sextw_r have destination tied to source >>> operands, >>> TwoAddressInstructionPass thinks it needs a copy. However, since >>> the >>> sext kills its source, the copy is unnecessary. Why
2009 Apr 17
0
[LLVMdev] How do I model MUL with multiply-accumulate instruction?
On Apr 16, 2009, at 2:19 PM, Greg McGary wrote: > The only multiplication instruction on my target CPU is > multiply-and-accumulate. The result goes into a special register that > can be destructively read at the end of a sequence of multiply-adds. The > following sequence is required to do a simple multiply: > > acc r0 # clear accumulator, discarding its value (r0 reads as
2009 Apr 16
2
[LLVMdev] Help me improve two-address code
I have my new port limping enough to compile a very basic function: int foo (int a, int b, int c, int d) { return a + b - c + d; } clang-cc -O2 yields: define i32 @foo(i32 %a, i32 %b, i32 %c, i32 %d) nounwind readnone { entry: %add = add i32 %b, %a ; <i32> [#uses=1] %sub = sub i32 %add, %c ; <i32> [#uses=1] %add4 = add i32 %sub, %d ; <i32>
2009 Apr 16
2
[LLVMdev] How do I model MUL with multiply-accumulate instruction?
The only multiplication instruction on my target CPU is multiply-and-accumulate. The result goes into a special register that can be destructively read at the end of a sequence of multiply-adds. The following sequence is required to do a simple multiply: acc r0 # clear accumulator, discarding its value (r0 reads as 0, and sinks writes) mac rSRC1, rSRC2 # multiply sources, store
2009 Apr 12
9
[LLVMdev] Porting LLVM backend is no fun yet
As we've already seen, David Chisnall prefers hacking LLVM over GCC (see http://www.informit.com/articles/article.aspx?p=1215438): "In contrast, every time I look at the GCC code, it takes two people to prevent me from clawing my eyeballs out." I'm sorry to report that so far I have had the opposite experience. Some years ago, I ported binutils (via CGEN) and GCC to an
2009 Apr 13
0
[LLVMdev] Porting LLVM backend is no fun yet
Hi Greg, I understand your frustration. I've been on this mailing list for a little over a year hoping that by osmosis I could get a better handle on writing a back end for LLVM. Although I feel more comfortable with the nomenclature, I still do not have a clue as to how to begin (actually I do, but it sounds more dramatic saying it this way). I've read the documentation, but
2009 Apr 20
1
[LLVMdev] How to prevent LLVM from undoing a custom lowering
My target has only logical shifts and lacks an arithmetic right shift instruction. I have a custom LowerSRA function that rewrites SRA as SHL + SIGN_EXTEND when the shift width is either constant 16 or 24. Unfortunately, I observe that a later pass combines the SHL + SIGN_EXTEND back into SRA so we crash. The idea I had for defeating this behavior is lower to a target-specific version of SHL
2009 Apr 10
2
[LLVMdev] cross llvm
I have some broad newbie questions about LLVM and its language front-ends with regard to cross targeting: I assume LLVM IR and bitcode are machine independent, yet bitcode files encode an arch triple. Why? Is it just a hint for subsequent lowering phases, or is it a recommended target? Does IR/bitcode produced by a front-end configured for ARM differ from bitcode for, say, PowerPC or x86?
2009 Apr 13
0
[LLVMdev] Porting LLVM backend is no fun yet
On Apr 11, 2009, at 5:03 PM, Greg McGary wrote: > As we've already seen, David Chisnall prefers hacking LLVM over GCC > (see http://www.informit.com/articles/article.aspx?p=1215438): "In > contrast, every time I look at the GCC code, it takes two people to > prevent me from clawing my eyeballs out." > > I'm sorry to report that so far I have had the
2009 Apr 13
0
[LLVMdev] Porting LLVM backend is no fun yet
On Apr 11, 2009, at 5:03 PM, Greg McGary wrote: > As we've already seen, David Chisnall prefers hacking LLVM over GCC > (see http://www.informit.com/articles/article.aspx?p=1215438): "In > contrast, every time I look at the GCC code, it takes two people to > prevent me from clawing my eyeballs out." > > I'm sorry to report that so far I have had the