I was able to run through all the C/C++ benchmarks in SPEC using LLVM.
I'm on OS X 10.3.3. I did a quick comparison between LLVM (latest from
CVS as of 4/27) and gcc 3.3 (Apple's build 20030304). For simplicity's
sake, the only flag I used was -O3 for each compiler, and I was using
the C backend to generate native code for PPC.

Most of the LLVM results were close to gcc performance (within 5%), but
a few of the tests caught my eye. 164.gzip ran about 25% slower on my
system using LLVM versus gcc. As you said, source-level debugging
information wasn't available for the LLVM binary, but from looking at a
profile of the code, there are two functions that take up a moderate
amount of time (zip and file_read) in the LLVM binary, while these
functions are not in the profile of the gcc code. Is it likely that gcc
would have inlined these?

file_read is relatively small, but zip is a little bigger. I tried to
test this theory by manually editing the gzip code to inline those two
functions, e.g.

  inline int zip( ...
  inline int file_read ( ..

but when I profiled that new code, it still had those two functions in
the profile. Does LLVM support inlining (or am I an idiot and tried to
do it manually wrong)?

Patrick

On May 2, 2004, at 10:40 PM, Chris Lattner wrote:

> On Sun, 2 May 2004, Patrick Flanagan wrote:
>> Is there anything special flagwise that I would need to specify to
>> tell it to include symbol and debug information? I've tried
>> specifying -g but this information still doesn't seem to be
>> included. A quick copy of the build of one of the tests to make sure
>> I've got the flags right:
>
> Nope. Right now LLVM doesn't have real support for source-level
> debugging. There is a debugger *started*, but it needs substantial
> work before it can be usable, and the C front-end cannot produce
> debug information yet. If you're interested in the debugger, it is
> discussed here:
> http://llvm.cs.uiuc.edu/docs/SourceLevelDebugging.html
>
> Sorry!
>
> -Chris
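For reference, a minimal self-contained sketch of the kind of edit
described in the message above (the zip and file_read bodies here are
invented stand-ins, not gzip's real code; the point is that "inline"
is only a hint, which a compiler is free to ignore):

  #include <stdio.h>

  /* Toy stand-ins for gzip's zip() and file_read(); bodies invented. */
  static inline int file_read(char *buf, unsigned size)
  {
      unsigned i;
      for (i = 0; i < size; ++i)
          buf[i] = (char)i;     /* pretend to fill the input window */
      return (int)size;
  }

  static inline int zip(int in, int out)
  {
      char window[256];
      int n = file_read(window, sizeof window);
      return n + in + out;      /* stand-in for the real deflate loop */
  }

  int main(void)
  {
      printf("%d\n", zip(1, 2));
      return 0;
  }

Even with both functions marked inline, a compiler that ignores the
hint (as LLVM did at the time, per the reply below) will still emit
and call them out of line, which is why they keep showing up in the
profile.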
On Tue, 4 May 2004, Patrick Flanagan wrote:

> I was able to run through all the C/C++ benchmarks in SPEC using
> LLVM. I'm on OS X 10.3.3. I did a quick comparison between LLVM
> (latest from CVS as of 4/27) and gcc 3.3 (Apple's build 20030304).
> For simplicity's sake, the only flag I used was -O3 for each
> compiler, and I was using the C backend to generate native code for
> PPC.

Okay, sounds great. Are you using the -native-cbe option? Or are you
running llc -march=c ... and GCC manually?

> Most of the LLVM results were close to gcc performance (within 5%),
> but a few of the tests caught my eye. 164.gzip ran about 25% slower
> on my system using LLVM versus gcc.

Hrm, I really want to figure this out!

> As you said, source-level debugging information wasn't available for
> the LLVM binary, but from looking at a profile of the code, there
> are two functions that take up a moderate amount of time (zip and
> file_read) in the LLVM binary, while these functions are not in the
> profile of the gcc code. Is it likely that gcc would have inlined
> these?

It's quite possible. The best way to check is to look at the .s file
produced by GCC and see if they are there. Note that GCC is much more
aggressive about inlining than LLVM is.

> file_read is relatively small, but zip is a little bigger. I tried
> to test this theory by manually editing the gzip code to inline
> those two functions, e.g.
>
>   inline int zip( ...
>   inline int file_read ( ..
>
> but when I profiled that new code, it still had those two functions
> in the profile. Does LLVM support inlining (or am I an idiot and
> tried to do it manually wrong)?

LLVM supports inlining, and you're not an idiot. :) The problem is
that LLVM doesn't "listen" to "inline" hints at all right now. If you
would like to adjust the inlining thresholds, you can pass
-Wa,-inline-threshold=XXX or -Wl,-inline-threshold=XXX to set the
compile-time or link-time inlining thresholds, respectively. These
both default to 200 (which has no units); if you increase it, the
inliner will inline more. If you want to see what inlining decisions
are being made, pass -debug-only=inline (with -Wa, or -Wl,) to see
what "choices" the inliner is making.

Note that, even without source-level debugging information, you can
still do performance investigation with LLVM. You can either look at
the C code generated by the CBE (which will hurt your eyes: brace
yourself), or you can look at the LLVM code directly, which will be
easier to handle (once you get used to reading LLVM).

I suspect that a large reason that LLVM does worse than a native C
compiler with the CBE+GCC is that LLVM generates very low-level C
code, and I'm not convinced that GCC is doing a very good job with it
(i.e., without syntactic loops).

Please let me know what you find!

-Chris
--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/
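A hypothetical invocation of the threshold flags described above might
look like this (the driver name llvmgcc and the exact flag spellings
of that era are assumptions based on the message, not verified):

  # Raise the link-time inlining threshold from its default of 200:
  llvmgcc -O3 -Wl,-inline-threshold=400 gzip.c -o gzip

  # Watch the inliner's link-time decisions:
  llvmgcc -O3 -Wl,-debug-only=inline gzip.c -o gzip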
On Tue, 4 May 2004, Chris Lattner wrote:

> I suspect that a large reason that LLVM does worse than a native C
> compiler with the CBE+GCC is that LLVM generates very low-level C
> code, and I'm not convinced that GCC is doing a very good job with
> it (i.e., without syntactic loops).

Yup, this is EXACTLY what is going on. I took this very simple C
function:

  int Array[1000];

  void test(int X) {
    int i;
    for (i = 0; i < 1000; ++i)
      Array[i] += X;
  }

Compiling with -O3 on OS X gave me this:

  _test:
          mflr r5
          bcl 20,31,"L00000000001$pb"
  "L00000000001$pb":
          mflr r2
          mtlr r5
          addis r4,r2,ha16(L_Array$non_lazy_ptr-"L00000000001$pb")
          li r2,0
          lwz r9,lo16(L_Array$non_lazy_ptr-"L00000000001$pb")(r4)
          li r4,1000
          mtctr r4
  L9:
          lwzx r7,r2,r9   ; load
          add r6,r7,r3    ; add
          stwx r6,r2,r9   ; store
          addi r2,r2,4    ; increment pointer
          bdnz L9         ; decrement count register, branch while not zero
          blr

This is nice code, good GCC. :)

Okay, LLVM currently generates this code from the CBE:

  void test(int l7_X) {
    unsigned l8_indvar;
    unsigned l8_indvar__PHI_TEMPORARY;
    int *l14_tmp_2E_5;
    int l7_tmp_2E_9;
    unsigned l8_indvar_2E_next;

    l8_indvar__PHI_TEMPORARY = 0u;   /* for PHI node */

  l13_no_exit:
    l8_indvar = l8_indvar__PHI_TEMPORARY;
    l14_tmp_2E_5 = &Array[l8_indvar];
    l7_tmp_2E_9 = *l14_tmp_2E_5;
    *l14_tmp_2E_5 = (l7_tmp_2E_9 + l7_X);
    l8_indvar_2E_next = l8_indvar + 1u;
    if (!(l8_indvar_2E_next == 1000u)) {
      l8_indvar__PHI_TEMPORARY = l8_indvar_2E_next;   /* for PHI node */
      goto l13_no_exit;
    }
    return;
  }

This has exactly the same operations in the loop, so GCC should
produce the same code, right? Wrong:

  _test:
          mflr r4
          bcl 20,31,"L00000000001$pb"
  "L00000000001$pb":
          mflr r2
          mtlr r4
          li r11,0
          addis r10,r2,ha16(_Array-"L00000000001$pb")
  L2:
          slwi r2,r11,2                            ; shift left "i" by 2
          la r5,lo16(_Array-"L00000000001$pb")(r10)
          cmpwi cr0,r11,999                        ; compare i to the trip count
          lwzx r7,r2,r5                            ; load from array
          addi r11,r11,1                           ; increment "i"
          add r6,r7,r3                             ; add value to array value
          stwx r6,r2,r5                            ; store into array
          bne+ cr0,L2                              ; loop until done
          blr

Hrm, basically gcc is not doing ANY loop optimization (e.g. strength
reduction or "do-loop" optimization) whatsoever. I'm sure that the X86
GCC is suffering from the same problems; it's just that X86 doesn't
depend on strength reduction and do-loop optimization as much, so it's
not so pronounced.

Interestingly, if I tweak the .cbe code to be this:

  do {
    l8_indvar = l8_indvar__PHI_TEMPORARY;
    l14_tmp_2E_5 = &Array[l8_indvar];
    l7_tmp_2E_9 = *l14_tmp_2E_5;
    *l14_tmp_2E_5 = (l7_tmp_2E_9 + l7_X);
    l8_indvar_2E_next = l8_indvar + 1u;
    l8_indvar__PHI_TEMPORARY = l8_indvar_2E_next;   /* for PHI node */
  } while (!(l8_indvar_2E_next == 1000u));

GCC generates the nice code again, virtually identical to the code
from the original source. AAAH! :)

Maybe this is a good argument for making the CBE generate syntactic
loops in simple cases. I may have some time to try implementing this
on the weekend. That is, if no one beats me to it. :)

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/
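As a sketch of what the strength reduction and do-loop optimization
discussed above buy you, here is the same loop hand-reduced in C; this
is an illustration of the target form, not code the CBE actually
emits:

  int Array[1000];

  void test(int X)
  {
      int *p = Array;
      int *const end = Array + 1000;
      /* The induction variable is now a running pointer, so the loop
         body needs no shift/multiply for indexing; combined with a
         hardware count register, this maps directly onto the
         lwzx/stwx/addi/bdnz loop GCC emits above. */
      do {
          *p += X;
          ++p;
      } while (p != end);
  }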