thr3ads.net - llvm dev - [LLVMdev] Alias Analysis: zero terminated strings [Sep 2011]

If this information is useful, please help other people find it:
Share via:

Carl-Philip Hänsch

2011-Sep-12 13:57 UTC

[LLVMdev] Alias Analysis: zero terminated strings

Hello,

I'm developing a programming language that is optimized for strings. A first
hello world program shows me that llvm needs a lot more work on zero
terminated strings.
In the following example, I have an auto generated hello world example
optimized with -O3. The problem is, that the constant string is copied into
a malloced mem area, then puts is called and then the memory is freed. There
is also some leftover from the reference counters. These are found by the
dead code eliminaton after the puts call. But before the puts call, the
constant folded number is put into the memory and is never used. I was told
that llvm assumes that a function also can read below the pointer, so dead
code elimination does not work here. The second thing i would like to have
there is to tell LLVM that the interesting memory ends after the zero
termination. I think these two flags: dont_read_below and
dont_read_above_zero should be enough to make LLVM optimze that example.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110912/a990e36f/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: test2.ll
Type: application/octet-stream
Size: 1228 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110912/a990e36f/attachment.obj>

Nick Lewycky

2011-Sep-12 16:31 UTC

head link

[LLVMdev] Alias Analysis: zero terminated strings

Carl-Philip Hänsch wrote:> Hello,
>
> I'm developing a programming language that is optimized for strings. A
> first hello world program shows me that llvm needs a lot more work on
> zero terminated strings.
> In the following example, I have an auto generated hello world example
> optimized with -O3. The problem is, that the constant string is copied
> into a malloced mem area, then puts is called and then the memory is
> freed. There is also some leftover from the reference counters. These
> are found by the dead code eliminaton after the puts call. But before
> the puts call, the constant folded number is put into the memory and is
> never used. I was told that llvm assumes that a function also can read
> below the pointer, so dead code elimination does not work here. The
> second thing i would like to have there is to tell LLVM that the
> interesting memory ends after the zero termination. I think these two
> flags: dont_read_below and dont_read_above_zero should be enough to make
> LLVM optimze that example.
LLVM could figure out that there is no "below the pointer" by noticing
that the object came from malloc. I think the missing optimization here 
is a heap->stack transform.

I note that in your example the exit is not post-dominated by free(). 
The transform could still fire by noticing that the pointer returned by 
malloc never escaped the function. (A more expensive check would be to 
see that it never escaped the function along the path that didn't call 
free. This is related to http://llvm.org/PR8908#c1 .)

One other thing we may want is a flag for "does not care about the 
pointer itself, only what the pointer points to". Currently, nothing 
tells LLVM that puts() doesn't check whether the pointer argument == 
&string_00000001, so we can't actually remove the copy. We could 
special-case that optimization into SimplifyLibCalls.

Nick

Duncan Sands

2011-Sep-12 17:02 UTC

head link

[LLVMdev] Alias Analysis: zero terminated strings

Hi Nick,
>> I'm developing a programming language that is optimized for
strings. A
>> first hello world program shows me that llvm needs a lot more work on
>> zero terminated strings.
>> In the following example, I have an auto generated hello world example
>> optimized with -O3. The problem is, that the constant string is copied
>> into a malloced mem area, then puts is called and then the memory is
>> freed. There is also some leftover from the reference counters. These
>> are found by the dead code eliminaton after the puts call. But before
>> the puts call, the constant folded number is put into the memory and is
>> never used. I was told that llvm assumes that a function also can read
>> below the pointer, so dead code elimination does not work here. The
>> second thing i would like to have there is to tell LLVM that the
>> interesting memory ends after the zero termination. I think these two
>> flags: dont_read_below and dont_read_above_zero should be enough to
make
>> LLVM optimze that example.
>
> LLVM could figure out that there is no "below the pointer" by
noticing
> that the object came from malloc.
I think it's more complicated: there is code that writes something before
the string, so the malloc'd memory can't be considered to be all
"undef"
except for the string part.

Ciao, Duncan.

  I think the missing optimization here> is a heap->stack transform.
>
> I note that in your example the exit is not post-dominated by free().
> The transform could still fire by noticing that the pointer returned by
> malloc never escaped the function. (A more expensive check would be to
> see that it never escaped the function along the path that didn't call
> free. This is related to http://llvm.org/PR8908#c1 .)
>
> One other thing we may want is a flag for "does not care about the
> pointer itself, only what the pointer points to". Currently, nothing
> tells LLVM that puts() doesn't check whether the pointer argument =>
&string_00000001, so we can't actually remove the copy. We could
> special-case that optimization into SimplifyLibCalls.
>
> Nick
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Carl-Philip Hänsch

2011-Sep-12 18:52 UTC

head link

[LLVMdev] Alias Analysis: zero terminated strings

2011/9/12 Nick Lewycky <nicholas at mxc.ca>
> LLVM could figure out that there is no "below the pointer" by
noticing that
> the object came from malloc. I think the missing optimization here is a
> heap->stack transform.
>
That sounds good for me. I myself have to concentrate on the compiler, so I
dont have so much time to implement one of these things on my own in the
near future.

I note that in your example the exit is not post-dominated by free().
The> transform could still fire by noticing that the pointer returned by malloc
> never escaped the function. (A more expensive check would be to see that it
> never escaped the function along the path that didn't call free. This
is
> related to http://llvm.org/PR8908#c1 .)
>
But the pointer left the function. The pointer was sent to puts (not the
pointer itself, but a getelementptr of it). I think, the best way would be
to do a heap->stack when malloc and free are in the same control flow.

>
> One other thing we may want is a flag for "does not care about the
pointer
> itself, only what the pointer points to". Currently, nothing tells
LLVM that
> puts() doesn't check whether the pointer argument ==
&string_00000001, so we
> can't actually remove the copy. We could special-case that optimization
into
> SimplifyLibCalls.
>
> Nick
>
But: A heap->stack optimization would not remove the mem copying. The
parameter flags to determine a zero extended string are still needed.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20110912/b47e8516/attachment.html>

Maybe Matching Threads

Search for more apparently analagous threads

llvm dev - Sep 2011 - [LLVMdev] Alias Analysis: zero terminated strings

[LLVMdev] Alias Analysis: zero terminated strings

[LLVMdev] Alias Analysis: zero terminated strings

[LLVMdev] Alias Analysis: zero terminated strings

[LLVMdev] Alias Analysis: zero terminated strings

Maybe Matching Threads